Claims 1-33 are pending in this application. Claims 1, 5, 6, 10, 20 and 30 have
been amended. Claims 34-38 are added. No new matter has been introduced thereby. 

In the Office Action mailed 04/11/2006, the Examiner objected to the information
disclosure statement filed on May 24, 2004 as it failed to comply with the provisions of 37
C.F.R. 1.97, 1.98, and MPEP §609. A copy of the information disclosure statement, in
accordance with the above provisions, and certification requirement for statements under 37
C.F.R. 1.97(e) is hereby enclosed. Accordingly, the Examiner's objections are now moot.

Claims 1, 5, 10, 13-18, and 30-34 are rejected under 35 U.S.C. §102(e) as being
anticipated by Milic-Frayling et al. (US 2006/0059138). Claims 2-4 and 11-12 are rejected
under 35 U.S.C. § 103(a) as being unpatentable over Milic-Frayling et al. Claims 6-9 and 19-20
were rejected under 35 U.S.C. § 103(a) as being unpatentable over Milic-Frayling in view of
"Creating a CD-ROM: Overview of the product field (CD-ROM authoring and data retrieval
software packages; includes company directory and related article on resources for doing
research)", Buyers Guide by Bernard Banet, Seybold Report on Desktop Publishing, v7, n6,
February 1, 1993. As noted, the cited references, alone or in combination, do not disclose or
suggest the present method of identifying entities having expertise in one or more subjects in
health care fields as recited in claim 1, as amended. The method includes querying a database
for documents relevant to a subject, and calculating a first score for each relevant document. The
method then determines entities affiliated with one or more relevant documents and calculates a
second score for each entity based on the one or more first scores of the one or more relevant
documents affiliated with the entity. The method includes ranking expertise of the entities based
on the respective second scores of the entities.
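
As merely an illustration, the steps summarized above (a first score per relevant document, a second score per affiliated entity, and a ranking of entities) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not the claimed implementation: the in-memory document list, the keyword-count first score, the summation used for the second score, and all names (`rank_entities`, `documents`, the sample entities) are hypothetical.

```python
from collections import defaultdict


def rank_entities(documents, subject):
    """Rank entities by a second score derived from per-document first scores."""
    # Step 1: find relevant documents. The first score is assumed here to be
    # a simple count of how often the subject string occurs in the text.
    first_scores = {}
    for doc in documents:
        hits = doc["text"].lower().count(subject.lower())
        if hits > 0:
            first_scores[doc["id"]] = hits

    # Step 2: for each entity affiliated with a relevant document, accumulate
    # the first scores of its documents (summation is an assumed choice).
    second_scores = defaultdict(int)
    for doc in documents:
        if doc["id"] in first_scores:
            for entity in doc["entities"]:
                second_scores[entity] += first_scores[doc["id"]]

    # Step 3: rank entities by second score, highest first (ties by name).
    return sorted(second_scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data: each document lists its affiliated entities.
documents = [
    {"id": 1, "text": "Stroke outcomes after stroke rehabilitation",
     "entities": ["Dr. A", "Hospital X"]},
    {"id": 2, "text": "Acute stroke imaging",
     "entities": ["Dr. B", "Hospital X"]},
    {"id": 3, "text": "Nausea management",
     "entities": ["Dr. A"]},
]

ranking = rank_entities(documents, "stroke")
for entity, score in ranking:
    print(entity, score)
```

In this sketch the entities are an output of the query, ordered by their aggregated scores, rather than a field the user searches on.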

In contrast, the conventional method provided by Milic-Frayling et al. is merely a
conventional search. That is, Milic-Frayling et al. merely relate to an "information highlight
facility" for searched documents, i.e., conventional search. Additionally, Milic-Frayling et al.
may even "re-rank" a document based on a model of a user's interest (paragraph 0041). Such
model is derived by monitoring the user's action and information provided by the task the user is
performing, e.g., working on a project, sending an email, etc. (paragraph 0042) and not the
method of identifying entities having expertise in one or more subjects in health care fields in the
manner claimed, as recited in claim 1, as amended. As merely an example, the entity can include
a hospital, doctor, or others according to the present invention. Additionally, a focus of
Milic-Frayling et al. on documents is consistent with their aim to assist a user in "searching,
browsing, and reading documents" (paragraph 0012). Similarly, their conclusion states that their
invention can "assist the user in evaluating the relevance of documents" (paragraph 0096), which
is not related to the present invention. Accordingly, claim 1 is patentable over Milic-Frayling et al.

The Examiner also cited Banet combined with Milic-Frayling et al. to reject claims
6-9 and 19-20. As noted, the cited references, alone or in combination, also fail to disclose or
suggest the present method of identifying entities having expertise in one or more subjects in the
manner claimed. Banet described features of CD-ROM retrieval software using fields such as
"author", "date", "title", "subject" as keywords to search for a document. In contrast, the method
according to the present invention identifies entities having expertise in one or more subjects in
health care fields as recited in claim 1, as amended. The entities recited in claim 6 and claim 19
include an author or one or more institutions from which the document emanated. Accordingly,
the entities recited in claim 1 are an output of the query rather than a keyword used for searching.
Accordingly, claim 1 is patentable over the cited references, alone or in combination.
Corresponding dependent claims 2-29 and additional features cited therein should also be
allowed based on at least the same reasons and others.

The National Library of Medicine Internet homepage (www.webarchive.com
from the year 2000), cited for the purpose of allegedly showing that Medlars is one of the
medical databases, does not cure the aforementioned deficiencies of Milic-Frayling et al. or
Banet. Accordingly, claim 1 is patentable over the cited references. Claims 2-29 and additional
features cited therein should also be allowed based on at least the same reasons and others.

Claim 30, which disclosed a system for the method of identifying entities having
expertise in one or more subjects, should also be allowed based on the rationale as discussed for
claim 1, and others. Accordingly, claim 30 is also patentable.

Applicant added new claims 34-38. No new matter has been introduced thereby.
Applicant asserts that newly added claim 34 is patentable over cited references. As shown, claim
34 recites a method of assessing expertise associated with an entity in a subject in health care
fields. As an example, the subject can be coronary bypass, nausea, stroke, and others. The
method includes querying a database (e.g., Pubmed) for documents relevant to the subject and
determining a first set of entities associated with the relevant documents. Again, as an example,
the entities could be authors, institutions, and others. The method also includes calculating a
score (e.g., quantification, numerical estimate) for each entity in the first set of entities based on
the number of relevant documents associated with each entity. As an example, the method
initially populates a database with, for example, the entity such as the institution and/or the
author with an associated score or scores. The method includes populating a second database to
include each of the entities in the first set of entities and the score associated with each of the
entities in the first set of entities.

Now that the database has been populated, a user can, for example, set up a query
to determine a desired institution and/or author or the like based upon the subject, which can
be coronary bypass or stroke, as an example. As provided by claim 34, the method includes
receiving a query related to an entity. The method includes determining a second set of entities
associated with the entity related to the query. The method includes retrieving from the second
database the score associated with each entity in the second set of entities. The method includes
representing to a user the scores of the entities in the second set of entities or a ranking of the
entities in the second set of entities based on the scores of the entities in the second set of
entities. The score of an entity is indicative of the expertise in the subject associated with the
entity. Such features are not suggested or disclosed in cited references. Accordingly, claim 34
should be allowed. Dependent claims 35-38 are also allowable. Accordingly, all claims now
pending in this application should be allowed for these reasons and others.
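
As merely an example, the two phases recited in claim 34 (populating a second database with entity scores, then answering a query related to an entity from that database) can be sketched as follows. This is a sketch under stated assumptions rather than the claimed implementation: a Python dict stands in for the second database, relevance is approximated by a substring test, a query is matched to entities by name, and all function names and sample records are hypothetical.

```python
from collections import Counter


def populate_score_db(documents, subject):
    """Score each entity by its number of relevant documents.

    Returns a dict standing in for the 'second database' (an assumption).
    """
    scores = Counter()
    for doc in documents:
        if subject.lower() in doc["text"].lower():
            # Count each entity at most once per relevant document.
            scores.update(set(doc["entities"]))
    return dict(scores)


def lookup(score_db, entity_query):
    """Return stored entities matching the query, ranked by stored score."""
    q = entity_query.lower()
    matches = [(name, score) for name, score in score_db.items()
               if q in name.lower()]
    return sorted(matches, key=lambda kv: (-kv[1], kv[0]))


# Hypothetical records standing in for documents returned by a literature query.
documents = [
    {"text": "Coronary bypass outcomes",
     "entities": ["General Hospital", "J. Smith"]},
    {"text": "Coronary bypass graft patency",
     "entities": ["General Hospital", "R. Lee"]},
    {"text": "Off-pump coronary bypass",
     "entities": ["City Hospital", "R. Lee"]},
    {"text": "Stroke prevention",
     "entities": ["City Hospital"]},
]

score_db = populate_score_db(documents, "coronary bypass")
result = lookup(score_db, "hospital")
for name, score in result:
    print(name, score)
```

The score is computed once, when the database is populated; the later query only retrieves and orders stored scores.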

Applicant believes that all claims in this Application are in condition for
allowance. The issuance of a formal Notice of Allowance at an early date is respectfully
requested.

If the Examiner believes a telephone conference would expedite prosecution of 
this application, please telephone the undersigned at 650-326-2400. 



Respectfully submitted,




1 T. Oga\ 
Reg. No. 37,692 



TOWNSEND and TOWNSEND and CREW LLP 
Two Embarcadero Center, Eighth Floor 
San Francisco, California 94111-3834
Tel: 650-326-2400
Fax: 415-576-0300
RTO:wcf 

60843376 v1

01584388 SUPPLIER NUMBER: 13440954 (THIS IS THE FULL TEXT)
Creating A CD-ROM: overview of the product field. (cd-ROM authoring and
data retrieval software packages; includes company directory and related
article on resources for doing research) (Buyers Guide)
Banet, Bernard
Seybold Report on Desktop Publishing, v7, n6, p3(29)
DOCUMENT TYPE: Buyers Guide ISSN: 0889-9762 LANGUAGE: ENGLISH
RECORD TYPE: FULLTEXT
WORD COUNT: 17829 LINE COUNT: 01443

TEXT:
Last August, we published a broad overview of the cd-rom publishing
process. This month, we continue that discussion by characterizing some of
the specific software tools, called authoring systems, that prepare and
index text, data and graphics for inclusion on cd-rom discs.

You don't need an authoring system to publish a cd-rom title. You
could organize a collection of typesetting, word processing, graphics,
database, spreadsheet, audio and/or digital motion video files into dos
directories or Mac folders, up to 700 megabytes' worth, and send them off
to be premastered, mastered and replicated. The resulting cd-roms would,
however, present certain problems to the user, such as how to find anything
in all of those files and how to display, listen to, print, or otherwise
utilize the data.



garden-variety pc application software on the disc: a desktop publishing
program, a database manager, a full-text search package, etc., that would
be able to access, manipulate, and output the information. This approach,
however, would run into further difficulties: the need to pay licensing
fees for each program on each disc; the discovery that programs designed
for standard desktop computer applications are not optimized for cd-rom;
and the realization that requiring a separate application program for each
type of data is not the way to create a coherent information product.

The functions of retrieval software

The alternative to putting only data on a cd-rom, or to including
standard desktop programs on the disc, is to incorporate a specialized
cd-rom retrieval program on each cd that can find and display the
information the user requests, from whatever data formats are on the disc.
Ideally, the user would be able to print and download the information as
well. The retrieval software is analogous to the runtime module of a
database management system.

Retrieval software includes the search engine that reads the index
files and locates the sought-after information, and a user interface that
presents the options and interprets the user's choices. The cd-rom user
interface prompts users to frame and refine search queries and to navigate
through a large amount of text, data and graphic material without getting
lost in hyperspace.

Building the disc

Understanding the role of cd-rom retrieval programs, we turn to build,
or authoring software. Cd-rom "authoring" programs prepare the data files
for a cd-rom title to be accessed by a specific retrieval program. The
tools import text and other files, organize and index them so they
can be searched and displayed by the retrieval software. Note
that "authoring" does not imply creation of original material for
[illegible] used in the cd-rom industry, the term implies processing content
from a variety of computer-readable files to make it accessible to a user,
with the aid of the matching retrieval program, on a cd-rom-equipped
computer.

In general, the authoring, or build, tools and the retrieval software
are "made for each other" and sold by the same vendor as a package. The two
complementary modules must at this point be considered together by the
prospective publisher.

A similar process. The process of building a master disc is similar
in principle, regardless of the type of information it includes. The basic
areas are as follows:
* Setting up, modifying or developing the user interface program that will
be the front end for the retrieval software;
* Building the title's database by defining, in software, the database
structure used to organize the content;
* Verifying that the files to be imported into the application are
consistently and properly formatted;
* Importing and marking up the data to make it fit the structure of the new
title;
* Verifying that the files have been imported as intended and flagging
possible errors and inconsistencies;
* Adding (hyper)links linking the data elements, either manually or
automatically;
* Indexing the data so that desired material can be found quickly and
efficiently;
* Simulating, from the hard disk drive, the functioning of the title with the
retrieval software that will be used with it;
* Debugging and revising the data and application as needed; and
* Outputting files for submission to in-house premastering or (on
removable media) to premastering/mastering at a disc replication
facility ("pressing plant").

These cd-rom data preparation procedures come after the title's
database structure and retrieval interface have been designed conceptually.
They also sometimes follow preliminary data conversion steps, such as
digitizing of nondigital source materials (paper documents, for example)
and conversion of digital data such as word processing files to formats
that the authoring software can process. The authoring package won't always
provide all the utilities for scanning pictures or page images, converting
text on paper to ascii characters, and stripping or rewriting codes in
publishing files that are foreign to the indexing and retrieval utilities.
Authoring packages vary considerably in the file formats and data types
that they can import, and the way in which these files are further
processed for compatibility with the retrieval software.

For potential publishers, evaluation of the performance and ease of
use of the data preparation tools themselves has been, in practice,
secondary to the retrieval software, which determines the functionality and
performance of the title to be distributed. There is, however, variation
also in the power, performance and ease of use of the authoring programs.

Looking over the field

This article will be an overview of some of the most widely used
cd-rom authoring and retrieval packages and the features that differentiate
one from another. This roundup isn't intended to be a buying guide or
product evaluation; it is designed to help narrow the search for software
that might meet a particular cd-rom publishing need. The information comes
from questionnaires filled out by vendors, vendor product summaries and
interviews.

We will focus primarily on programs that can deal competently with
possibly hundreds of megabytes of text and images derived from documents
and books, or with data from structured database records with defined
fields, omitting authoring packages with a multimedia (audio and motion
video) emphasis to the exclusion of large amounts of text, those that are
not optimized for cd-rom use, and those that do not have a licensing
arrangement for including retrieval software on each cd-rom.

We will not include authoring and retrieval packages that are intended
for specialized platforms such as Sony's Bookware for the Multimedia cd-rom
player (MMCD) and SEBAS, the Sony Electronic Book Authoring system for the
Data DiscMan. The packages we list can prepare data for premastering into
iso 9660 format, into Macintosh hfs format, or both, and have retrieval
programs that run on one or more of the common desktop platforms:
Intel/ms-dos, dos plus Windows, or Macintosh.

Retrieval programs that are marketed as engines only -- that is, those
without a user interface -- are also not listed here. Fulcrum Technologies'
Ful/Text and Sony Electronic Publishing Company's FTR (full-text retrieval)
packages, for example, supply a retrieval engine or software database
server without a user interface. Retrieval engines such as Fulcrum's are
intended to be embedded in other vendors' more complete retrieval offerings
or are mated by the cd-rom publisher with a custom-programmed user
interface.

We also have not included the page-based Wysiwyg viewers, such as
Interleaf's WorldView, Northern Telecom's Helmsman, Thaumaturgy's eDDARS
and Frame Technology's FrameViewer. If you have source files in a robust
electronic form, these products are worth examining as alternatives to the
more traditional, text-based approach.

Selecting cd-rom authoring and retrieval software means looking for
programs that can manage the kinds and quantity of data planned for the
title or series, and that are suited to the expectations and experience of
likely users, and the computer platforms and configurations available to
them.

Cd-rom authoring and retrieval programs represent the confluence of
several families of software. Each cd-rom authoring/retrieval package can
borrow techniques from database management systems; [illegible]
management and full-text retrieval; word processing and desktop publishing;
presentation graphics; hypertext; video and computer games; computer-based
training; and multimedia.

Database models. The ancestors of some cd-rom retrieval packages,
removed by several generations, were online abstract and index databases
used in library timesharing applications to search for relevant articles
and publications. These often used a keyword approach to locate relevant
content. Although full-text indexing and search tools are the heart and
soul of many of the retrieval modules, text documents are still often
described by and linked to records with defined fields such as "author,"
"date," "title," "subject" and so on.

Database technology in general influences the underlying search
engines, query languages, user interfaces and data models. Most frequently,
structured cd-rom data is organized into the fields-in-records of a simple
flat file or table. Cd-rom authoring packages may also borrow concepts, or
at least vocabulary, from relational, hierarchical or object-oriented
approaches. MediaBase uses a hierarchical outline as its organizing theme,
for example, while HyperWriter/HyperReader links multimedia objects.

Fielded data and fields within text. Some cd-rom authoring/retrieval
software has been explicitly optimized for structured, fielded data. An
example of such data would be a telephone directory, where fields such as
first name, last name, street address, city and phone number are obvious
ways to enter and store the data. Dataware's CD Author/CD Answer and Key
Record Build/Record ReferenceBook and Knowledge Access KAware for fielded
data are examples of packages designed for structured, relatively short
records.

Text-oriented packages also borrow structured record
organization, utilizing defined fields as search keys for text documents or
sections. Fielded records as part of text files or linked to text files
allow the user to constrain a search to the Author field, for example, in a
database of book abstracts, or to the Director field in a
database of movie descriptions. Most cd-rom retrieval packages are based on
full-text retrieval engines. Even the "fielded" information in many of them
is handled as a marked-up part of a text file, readable as such, rather
than stored in a fixed-length format or in a variable-length record format
with fields separated only by a comma or other delimiter.
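
The idea of fields carried as marked-up text, and of constraining a search to one field, can be illustrated with a short Python sketch. It is only a sketch under assumptions: the `Name: value` header convention, the blank line separating header from body, and the function names are all hypothetical and are not taken from any package discussed here.

```python
import re

# Assumed markup: leading "Name: value" lines, then the free text.
FIELD_LINE = re.compile(r"^(\w+):\s*(.*)$")


def parse_record(text):
    """Split a marked-up text record into a dict of fields and a body."""
    fields, body = {}, []
    for line in text.strip().splitlines():
        match = FIELD_LINE.match(line)
        if match and not body:
            # Still in the header: store the field under a lowercase name.
            fields[match.group(1).lower()] = match.group(2)
        else:
            # First non-field line (e.g. the blank separator) starts the body.
            body.append(line)
    return fields, "\n".join(body).strip()


def search(records, term, field=None):
    """Return indexes of records containing term, optionally in one field only."""
    term = term.lower()
    hits = []
    for i, raw in enumerate(records):
        fields, _body = parse_record(raw)
        haystack = raw if field is None else fields.get(field.lower(), "")
        if term in haystack.lower():
            hits.append(i)
    return hits


records = [
    "Author: Jane Austen\nTitle: Emma\n\nA novel about a matchmaker.",
    "Author: John Smith\nTitle: Reading Austen\n\nEssays on Jane Austen's novels.",
]

print(search(records, "austen"))                  # unconstrained search
print(search(records, "austen", field="author"))  # Author field only
```

Because the fields stay inside the text, the same record remains readable as plain text and searchable as a whole, while the field constraint narrows the hits to records where the term appears in the named field.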

There is considerable variation in the flexibility with which
different packages can use fields in searches, whether, for example, they
[illegible] such as nested [illegible] export of
fielded data to external databases; calculations on data from the
structured records; or report definition and creation.




Techniques from print publishing. Cd-rom retrieval software
increasingly borrows publishing systems technology for coding typography,
page layout and logical document structure, as well as relating drawings
and photographs to text. Retrieval [illegible] information [illegible], on
the other hand, often incorporate font codes and font display and printing.

[illegible] vendors that do not currently support sgml have announced that
sgml markup will be incorporated in future authoring products. These vendors
include ZyLab, KnowledgeSet and [illegible]. Page-description technologies
are beginning to become available for cd-rom. KnowledgeSet, for example, has
announced that it will support Postscript in a future release of krs. As
mentioned above, there are also page-based viewers, some of which support
Postscript [illegible]

[illegible] pointing the reader to a citation in another document. Hyperlinks,
variously defined, are important features in many of the programs.
Hyperlinks may include:
* Jumping from a word or phrase to other occurrences of that phrase in other
locations;
* Jumping to related material such as in a "see also" reference in an
encyclopedia;
* Jumping to footnotes, citations, glossaries or maps;
* Requesting expansion of a passage into a treatment in greater depth;
* Requesting a graphic or playing a multimedia clip from a point in the text;
* Jumping from a specific point in a graphic to other information;
* Jumping from the table of contents to a specific section or chapter.

Some of the fullest implementations of hyperlinks can be seen in
Ntergaid's HyperWriter/HyperReader, Compton's SmarTrieve and CD Author/CD
Answer.

Presentation and training features. Cd-rom retrieval packages can also
borrow concepts from presentation software and computerized training,
leading the user through a scripted flow sequence that branches according
to the user's responses. Training and presentation software also provide
the prototypes for being able to control physical audio/visual devices such
as videodisc players and software multimedia viewers or players.

Page images. When the cd-rom publisher starts with printed pages and
does not have digital files to start with, the technologies of document
image processing -- scanning and retrieving printed pages as bitmapped
images -- can be used. Increasingly in packages such as CD Author,
HyperText, SearchExpress and ZyIndex, the raster image file is linked to
character-based versions of the same content to permit full-text indexing.
Authoring packages such as SearchExpress and ZyImage are now offering ocr
modules that can process the page images into searchable text.

Text search. The key problem for cd-rom text management, and at the
same time its key advantage, is finding what you are looking for within a
mountain of text. For collections of related articles and for structured
documents, a common technique is full-text indexing.

Full-text indexing is logically similar to the familiar
back-of-the-book index: it is a way for the reader to find passages or
documents that contain specific terms. But electronic indexing [illegible]
because of its capability to look up virtually any word that might appear


PAGE 41/64 1 RCVD AT 8/2912006 3:08:31 



6 PM [Eastern Daylight Time] * SVR:USPT0-EFXRF*9 * ONIS:2738300 * CSID:1 650 326 2422 * DURATION (mm-ss):15-06 



AUG. 29. 2006 12:12PM . TTC-PA 650-326-2422 



NO. 7948 P. 42 



on the disc and because of the speed with which it can take the user
quickly to the located passages, which can number in the thousands on a
cd-rom. Electronic indexing can also assist the user in evaluating the
relevance of a search "hit" prior to viewing the document.

[illegible] of each word in the text and where each token or instance of it
may be found. Often, however, this basic technique pulls up too many "hits"
that are irrelevant to the searcher's quest, and ignores too many relevant
items that contain synonyms, related terms or slight variations on the
search terms. To cope with these central limitations of literal string
searching, a number of techniques can be used:
* Providing a "stop list" of common words that occur too frequently to be
useful in a word search. These words are simply not indexed. Sometimes the
author or the user (or both) can modify the list of stop words by making the
list larger or smaller.
* Allowing the user to search for phrases and individual words. "Disk drive"
can be a more useful search term than "disk" or "drive" alone.
* Combining search terms with Boolean operators (and, or, xor, not). For
example: Find references to "cd-rom" or "optical disc" or find references to
"Edison" and "light bulb" in the same document. Some programs support
complex, nested Boolean searches while others keep it simple to avoid
intimidating users.
* Providing automatic "stemming" that can find related words such as plurals
of nouns or past tenses of verbs, though the endings may be different. A
search for "atomic bomb" would also pull up references to atomic bombs and
atomic bombing.
* Allowing a wildcard to be specified in a string search. A search for
"optical dis*" could find "optical disc" or "optical disk."
* Using fuzzy logic to overcome spelling errors or multiple spellings. Some
programs can find "optical disk" as possibly relevant to "optical disc"
automatically, because they are "close enough" for the hit to be brought to
the user's attention.
* Allowing the user to specify other proximity parameters, e.g., look for
"cd-rom" and "authoring" within five words of one another.
* Encoding the logical structure of a document or collection of documents
via sgml and using this (or similar) markup to allow the user to constrain
full-text searches to specific books, chapters, sections, pages, paragraphs.
* Ranking the documents that contain search hits for possible relevance.
Many relevance algorithms exist. Some simply count the number of hits (e.g.,
KnowledgeSet's KRS). Others consider patterns of co-occurrence of terms
within a search item and weight rare words and phrases more highly than
common ones (e.g., Personal Librarian). Some programs (e.g., SearchExpress)
allow the user to assign weights to different terms manually for a given
search.
* Looking for related terms, or at the very least for synonyms, is a way of
going beyond literal string searching, usually consulting or constructing a
thesaurus (e.g., Dataware's ReferenceBook) or a more elaborate hierarchical
semantic network (e.g., SmarTrieve) to represent how the meanings of terms
interrelate. Either the authoring system developer, the title developer or
the user (depending on the implementation) -- or all of the above -- could
instruct such a retrieval system that "authoring" is related to "building"
and "indexing" and a hit on "building" should be treated as a hit on
"authoring."
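
Several of the techniques listed above (a stop list, Boolean combination, wildcards and proximity) can be illustrated on top of a small positional index. The Python below is a minimal sketch under stated assumptions, not a description of how any of the packages named here work: the stop list contents, the tokenizer, the function names and the three sample documents are all hypothetical.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase

# Assumed stop list: common words that are simply not indexed.
STOP_WORDS = {"a", "an", "and", "of", "the", "to", "on", "for", "is"}


def tokenize(text):
    """Lowercase the text and split it into words (hyphens kept)."""
    return re.findall(r"[a-z0-9-]+", text.lower())


def build_index(docs):
    """Map each non-stop word to {doc_id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        # Positions count every word, including stop words, so that
        # proximity distances reflect the original text.
        for pos, word in enumerate(tokenize(text)):
            if word not in STOP_WORDS:
                index[word][doc_id].append(pos)
    return index


def docs_with(index, pattern):
    """Documents containing any indexed word matching a wildcard pattern."""
    found = set()
    for word, postings in index.items():
        if fnmatchcase(word, pattern):
            found.update(postings)
    return found


def near(index, word1, word2, distance):
    """Documents where word1 and word2 occur within `distance` words."""
    found = set()
    for doc_id, positions1 in index.get(word1, {}).items():
        positions2 = index.get(word2, {}).get(doc_id, [])
        if any(abs(p1 - p2) <= distance
               for p1 in positions1 for p2 in positions2):
            found.add(doc_id)
    return found


docs = {
    1: "Authoring a cd-rom title for optical disc",
    2: "The optical disk drive and the cd-rom",
    3: "Edison and the light bulb",
}
index = build_index(docs)

both = docs_with(index, "edison") & docs_with(index, "light")    # Boolean and
either = docs_with(index, "cd-rom") | docs_with(index, "edison") # Boolean or
wild = docs_with(index, "dis*")                                  # wildcard
close = near(index, "cd-rom", "authoring", 5)                    # proximity
print(both, either, wild, close)
```

Set intersection and union give the Boolean operators directly, which is one reason a word-to-documents index is a convenient base for the other techniques; stemming, fuzzy matching and thesaurus lookup are left out of this sketch.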

Cd-rom authoring and retrieval products tend to be developed initially
with an emphasis on structured records or text search, or hypertext, but
over time they take on features of the other approaches.

Structured database-oriented programs get better at managing free-form
text and develop user interfaces with alternatives to conventional query
languages. Text-oriented programs add data fields and the features of
structured record retrieval, as well as logical markup a la sgml. Other
examples of this convergence include:
* Full-text indexing is enhanced with relevance-ranking algorithms.
* Text files are linked to page images of original printed documents.
* Authoring/retrieval packages that began with plain ascii characters become
able to deal with typographical features.
* Text becomes hypertext. Graphics become multimedia. Hypertext becomes
hypermedia.
* Page-oriented models based on publishing technologies add search and
navigation capabilities.
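
The relevance-ranking approaches mentioned in this article (simply counting hits, versus weighting rare words more highly than common ones) can be contrasted with a small Python sketch. It is an illustration under assumptions only: the logarithmic rarity weight is a generic inverse-document-frequency style formula chosen for the example, not the algorithm of any product named here, and the function names and sample documents are hypothetical.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9-]+", text.lower())


def rank_by_hits(docs, terms):
    """Rank documents by the raw number of search-term occurrences."""
    scores = {}
    for doc_id, text in docs.items():
        words = tokenize(text)
        hits = sum(words.count(term) for term in terms)
        if hits:
            scores[doc_id] = hits
    return sorted(scores, key=lambda d: (-scores[d], d))


def rank_by_rarity(docs, terms):
    """Rank documents with rare terms weighted more highly than common ones."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    total = len(docs)
    weights = {}
    for term in terms:
        # Number of documents that contain the term at least once.
        doc_freq = sum(1 for words in tokenized.values() if term in words)
        # Assumed weight: log(total / doc_freq) + 1, so a term found in
        # every document still counts, but less than a rare one.
        weights[term] = math.log(total / doc_freq) + 1.0 if doc_freq else 0.0
    scores = {}
    for doc_id, words in tokenized.items():
        score = sum(words.count(term) * weights[term] for term in terms)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    "a": "Disc to disc copying: a disc drive guide",
    "b": "Premastering a disc",
    "c": "Choosing a disc drive",
}
terms = ["disc", "premastering"]

print(rank_by_hits(docs, terms))
print(rank_by_rarity(docs, terms))
```

With these sample documents, counting hits favors the document that repeats the common word, while the rarity weighting moves the document containing the rare term to the top.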

This convergence of retrieval features is not just the result of
competition among vendors. It also stems from the fact that desktop systems
keep getting more powerful, and the basic system software keeps improving
in its capability to present typography, graphics and multimedia.

Data types and formats. One way to keep track of what an
authoring/retrieval package can or cannot do is to ask, "What data types
and specific formats are supported?" The chart on pages 6-7 shows the data
types and formats each package can import and retrieve in some manner. Note
that importing data and converting it to the authoring system's preferred
format may or may not strip it of some of its richness, whether the data is
in the form of text or graphics. Note, too, that some retrieval software
can make use of external viewer or player programs, called by (or spawned
or "launched" from) the retrieval program but separate from it. This is a
flexible approach to handling graphics and multimedia files, even text
files, but there may be a penalty to be paid in performance, and there may
be problems with pricing and licensing.

There is sometimes a need to restrict access to data on a cd-rom for
security reasons or to protect copyrighted material, for example. Various
packages support different techniques: data encryption, password
protection, output limitations and the ability to lock and unlock specific
content on the disc. This last feature can be used by mailing list or
software publishers, for example, who can distribute millions of names or
hundreds of programs at once but who can sell access to a limited subset as
[illegible].

Platforms. The hardware constraints in the target delivery systems can
be an important factor in selecting a retrieval package. In general, the
dos-only retrieval software will run adequately on '286-class systems with
640k of ram, while Windows packages will assume a Windows-capable system in
the '386 or '486 class with probably 4 mb of memory.

Some retrieval programs, such as Romware from Nimbus, have been
specifically developed to run without workspace on a magnetic drive, while
others assume that users will have several megabytes free on their hard
disks. Many packages are flexible in this respect, benefiting in
performance from at least a megabyte or so of space on the hard disk but
also capable of being run from the cd-rom disc alone or with workspace
available on a floppy.

The chart on page 11 shows the software environments that are
supported by the packages that we are considering.

Looking at the build tools

As noted above, the authoring tools themselves often get less
consideration than the features directly experienced by end users through
the retrieval module. Nevertheless, it is the build tools that the
publisher spends the most time with, and some vendors have gone to greater
lengths than others to make life easier for the preparation staff. Some of
the features publishers look at are:
* The development platforms supported, and the hardware and system software
requirements. For example: How much disk space is needed on the development
or authoring system hardware relative to the size of the data files that
will be indexed? Usually 2-5 times as much disk space is required. Can
authoring be done on a lan? Is there support for cross-platform development?
* Style of the user interface for authoring. Is there one program or a
collection of separate modules that must be run independently?
* The kinds and formats of data a package




the 1 to 50 megabytes per hour range, averaging between 10 and 20,
according to the vendors. Of course, this varies not only with the package
but also with the data and the design.

* What kinds of file-compression techniques are used? What is the
maximum size of the source files that can be included on a cd-rom?

* What is the overhead generated by the indexing process, as a
fraction of the file size of the original files? Par is something like 35%
for text files. A high ratio of index to data on the disc itself may,
however, only reflect the fact that the data files are efficiently
compressed. A low ratio may reflect a simple indexing scheme. For fielded
data it is possible that the original files and the index files together
are smaller than the original data after compression.

* How can the user interface be customized for a particular publisher
or title? Before going the custom programming route it makes sense to look
for off-the-shelf authoring software that lets the publisher customize the
search program without writing new code. Some of these methods for
customizing the user interface include removing options from the menu,
changing the command structure, and redefining function keys or menus.
Some vendors permit modifications to source code; others do not. Many of
the authoring packages that we list and that do make available a standard
user interface also provide an optional api (application programming
interface) with a toolkit for linking a custom-developed user interface to
the retrieval engine. Authoring systems with an optional api version of
the product include SmarTrieve, MediaBase, CD Author/CD Answer, Romware,
Search Express, krs, Re:Search, pls Personal Librarian and TextWare.

* Premastering tools are sometimes available from the authoring vendor
as an option, as is the case with Crowninshield, Knowledge Access or
Dataware. It is useful to know which output devices the premastering
software supports (dat, 8mm tape, 9-track tape, worm etc) and whether
recordable cd is one of them.
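
The rules of thumb quoted in this checklist can be turned into a rough
planning estimate. The sketch below is only an illustration: the function
name and its parameters are hypothetical, the defaults are the article's
ballpark figures (2-5 times the source size in working disk space, 10 to
20 megabytes per hour, about 35% index overhead for text), and it assumes
the megabytes-per-hour figure refers to indexing throughput, which the
surviving text does not state outright.

```python
def estimate_build(source_mb, disk_factor=(2, 5), mb_per_hour=(10, 20),
                   index_overhead_pct=35):
    """Rough planning figures for indexing `source_mb` megabytes of text.

    All defaults are ballpark values, not vendor specifications.
    """
    return {
        # Working space on the authoring machine: 2-5 times the source data.
        "work_disk_mb": (source_mb * disk_factor[0],
                         source_mb * disk_factor[1]),
        # Best case uses the fast rate, worst case the slow rate.
        "index_hours": (source_mb / mb_per_hour[1],
                        source_mb / mb_per_hour[0]),
        # Index size as a percentage of the original files.
        "index_mb": source_mb * index_overhead_pct / 100,
    }


if __name__ == "__main__":
    for name, value in estimate_build(100).items():
        print(name, value)
```

Actual figures vary with the package, the data and the design, so the
result is a starting point for questions to the vendor rather than a
prediction.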

There's considerable diversity in the pricing approaches of the
cd-rom authoring and retrieval software vendors, and in the total prices,
fees and royalties to the developer of using a particular set of authoring
tools. Pricing of cd-rom authoring/retrieval tools is based on one or more
of the following elements:

* The purchase price of the authoring tools;

* The pricing schedule of per-title (first title, second title...
unlimited titles) licensing fees for use of the authoring software;

* The price of the initial retrieval module;

* The pricing schedule of per-unit royalties for the runtime
retrieval software, which can be set as a function of quantity of
units (number of discs), the sale price, etc;

* There may also be maintenance fees (beginning the first or second
year) for the publisher and, in some cases, for the buyers of the cd-rom
products.

The licensing fees and royalties for using authoring and retrieval
software can be a few hundred dollars for an unlimited number of titles and
units, or they can be in the tens or hundreds of thousands of dollars.

We've summarized the licensing practices of each of the vendors in the
chart on page 9.

Services 

Along with software tools, many authoring system vendors also will
provide assistance with the steps in the authoring process, including title
design, data capture, file conversion, indexing, interface customization
and programming. They will also often help with premastering and disc
replication, if desired. Support can be vital in producing a cd-rom title.
Find out whether the support you need is available from the vendor, and at
what price.

Resources 

In doing your research, you will want to check the vendor's history
and experience, including a peek at titles that have been published with
the package in which you are interested. The SIGCAT Software Showcase 1992
(in iso 9660 format for dos) is still a useful sample cd-rom disc,
containing the retrieval modules from ten commercially available cd-rom
database products mated with the identical database.

It is available from Dr. Ash Pahwa, cd-rom Strategies, inc., 18
Chenile, Irvine, CA 92724; phone (714) 733-3378, fax (714) 766-1401.
Enclose a self-addressed envelope with postage, both of which accommodate a
cd. Not all of the packages, unfortunately, are fully functional as
transferred to this disc, due to a security feature in some of the software
that detected (innocently) modified date stamps on the files.



PAGE 44/64 ' RCVD AT 8/29/2006 3:08:36 PM [Eastern Daylight Time] 1 SVR:USPTO-EFXRF-5/9 * DNIS:2738300 * CSIDll 650 326 2422 1 DURATION (mm-ss):15-06 



AUG. 29. 2006 1 2:13PM TTC-PA 650-326-2422 NO. 7948 P. 45 



Another reasonably current resource (in printed form) is the Apple
Cd-rom Handbook: A Guide to Planning, Creating and Producing a cd-rom.
Written by Apple Computer, it was published by Addison-Wesley in July 1992
($14.95). (Of course this book is Mac/HFS oriented, rather than iso-9660,
but it is still a useful reference.)

Consider also the following publications, in addition to those
mentioned in our August article:

* Meckler's newsletters, including Multimedia/CD Publisher and Cd-rom
Librarian. Meckler Publishing, 11 Ferry Lane West, Westport, CT 06880;
phone (203) 226-6967, fax (800) 858-3144. (Personal subscriptions and K-12
school libraries get lower rates than other institutional subscribers.)

* Cd-rom Collection Builder's Toolkit: 1992, $39.95. Toolkit 92, 11
Tannery Lane, Weston, CT 06883; phone (800) 248-8466, fax (203) 222-0122.

* The Cd-rom Directory 1992 (860-page book, $149) or Cd-rom Directory
on Disc (on cd-rom, $199). Published by TFPL (London) and distributed by
UniDisc, 3941 Cherryvale Ave., Suite 1, Soquel, CA 95073; phone (408)
464-0707, fax (408) 464-0187.

Compton's NewMedia 2320 Camino Vida Roble Carlsbad, CA 92009 Phone:
(619) 929-2500 Fax: (619) 929-2555

Crowninshield Software 29 Crafts St., Suite 200 Newton, MA 02160
Phone: (617) 965-3383 Fax: (617) 965-1966

Dataware Technologies 222 Third St., Suite 3300 Cambridge, MA 02142
Phone: (617) 621-0820 Fax: (617) 621-0307

Executive Technologies 2120 16th Ave. South Birmingham, AL 35205
Phone: (205) 933-5494 Fax: (205) 930-5509

Folio 12155 North Freedom Blvd. Suite 150 Provo, Utah 84604 Phone:
(801) 375-3700 Fax: (801) 374-5753

Knowledge Access Int'l 2685 Marine Way, Suite 1305 Mountain View, CA
94043 Phone: (415) 969-0606 Fax: (415) 964-2027

KnowledgeSet 888 Villa St., Suite 410 Mountain View, CA 94041 Phone:
(415) 968-9888 Fax: (415) 968-9962

MicroRetrieval One Kendall Square Building 300 Cambridge, MA 02139
Phone: (617) 577-1574 Fax: (617) 577-9517

Nimbus Information Systems PO Box 7427 Charlottesville, VA 22906
Phone: (804) 985-1100 Fax: (804) 985-4625

Ntergaid 2490 Black Rock Turnpike Suite 337 Fairfield, CT 06430 Phone:
(203) 380-1280 Fax: (203) 380-1465

Online Computer Systems 20251 Century Blvd. Germantown, MD 20874
Phone: (301) 601-2190 Fax: (301) 428-2903

Personal Library Software 2400 Research Blvd. Suite 350 Rockville, MD
20850 Phone: (301) 990-1155 Fax: (301) 963-9738

TextWare PO Box 3267 Park City, UT 84060 Phone: (801) 645-9600 Fax:
(801) 645-9610

TMS 110 West Third St. Stillwater, OK 74076 Phone: (405) 377-0880 Fax:
(405) 377-0452

Voyager 1351 Pacific Coast Highway Santa Monica, CA 90401 Phone: (310)
451-1383 Fax: (310) 394-2156

ZyLab Division of idi 100 Lexington Dr. Buffalo Grove, Illinois 60089
Phone: (800) 544-6339, (708) 459-8000 Fax: (708) 459-6054
[TABULAR DATA OMITTED] 

Compton's NewMedia: SmarTrieve

The Encyclopaedia Britannica is not yet on cd-rom, but Britannica's
corporate sibling, Compton's NewMedia, is one of the leaders in cd-rom
reference book publishing, boasting such titles as Compton's MultiMedia
Encyclopedia and the Guinness Disc of Records. SmarTrieve consists of the
software tools used to produce such popular and pioneering titles.
Compton's NewMedia is making these authoring and retrieval modules
available to other publishers seeking to repurpose print material and add
multimedia elements. ESC, Jostens, DMG, EB, Context, USA Today, Dialog,
Mediashare, Ingram and B. Dalton have already taken advantage of this
opportunity.

SmarTrieve offers easy-to-use interfaces for the three common desktop
platforms (dos, Windows and the Mac) and for the Sony portable cd-rom xa
player (also known as the "mmcd" or "Bookman"). These front ends are
unusual in that they accept natural language queries such as "Why is the
sky blue?" or "Do fish in the ocean sleep?" More evidently than with much
of the competition, the user interfaces have been designed for the general
public rather than the trained database searcher or experienced computer
user.

are implemented automatically,




text entries, graphics, animation.

Idea, concept searching. The full-text index permits word searching,
but the software also supports what Compton's claims is sophisticated idea
or concept searching.

When SmarTrieve accepts a natural language query it is not
really parsing the English entry; it does not "understand" the question at
all! When you ask SmarTrieve "Why is the sky blue?" SmarTrieve has no way
to extract the linguistic structure of the question or to evaluate its
"meaning." The software is oblivious, in other words, to natural language
syntax and, at this point in the process, to semantics as well. What
SmarTrieve does, however, is to process this query in roughly the following
way:

1. It drops the stop words and concentrates on words that occur with
lower frequency in the full-text index. In this example, it skips "Why is
the" and looks at "sky blue."

2. It looks up "sky blue" in a concept dictionary, a hierarchical
semantic network that maps related terms: synonyms, subclasses,
superclasses, etc. This network has been derived from the massive
Britannica reference work files, which include encyclopedias, dictionaries,
thesauri and other sources.

Phrase support in the dictionary will find that "sky blue" is a
phrase in English, and ask the user whether the question is about the
entity "sky blue." Upon user response that this phrase is a red herring and
not really of interest, the concept dictionary is consulted to find terms
related to "sky" and to "blue" separately.

3. It finds documents containing "sky" or "blue" and/or any terms the
concept dictionary defines as related to these ("atmosphere" and "azure,"
perhaps).

4. ... consulted, so that a child's "Why is the sky bloi?" would also
be properly processed.

5. SmarTrieve then flags the encyclopedia articles that contain
references to sky, blue or related concepts, from the full-text inverted
index. The index also is used to calculate in which articles the two
concepts are both found (co-occurrence), how often they co-occur, how close
to one another they co-occur (proximity) and just the frequency of
occurrence of each term individually in each flagged article. The
algorithms chugging away to weight all of these factors also take into
account whether the key terms occur in the body of the text or in the title
of an article, for example. A weighting is assigned to each article.

6. The articles are sorted in rank order of the relevance weighting,
the titles are listed, and the user is assisted in finding out whether an
article is really relevant by looking at the context in which the possibly
relevant term or terms occur in the article.
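
The numbered steps above can be sketched in a few lines of Python. This is
not Compton's algorithm, whose weighting formula is not given here: the
stop-word list, the two-entry concept dictionary, the bonus values and all
function names are assumptions made up for illustration, and proximity
weighting is left out to keep the sketch short.

```python
import re
from collections import Counter

# Hypothetical stop words and concept dictionary, for illustration only.
STOP_WORDS = {"why", "is", "the", "a", "an", "of", "do", "in"}
CONCEPTS = {"sky": {"atmosphere"}, "blue": {"azure"}}


def words(text):
    return re.findall(r"[a-z]+", text.lower())


def key_terms(query):
    """Step 1: drop the stop words and keep the remaining query words."""
    return [w for w in words(query) if w not in STOP_WORDS]


def expand(term):
    """Steps 2-3: add the related terms listed in the concept dictionary."""
    return {term} | CONCEPTS.get(term, set())


def score(article, concepts, title_bonus=2.0, cooccur_bonus=1.5):
    """Step 5: weight frequency, title hits and co-occurrence (assumed values)."""
    title_words = set(words(article["title"]))
    counts = Counter(words(article["body"]))
    total, matched = 0.0, 0
    for group in concepts:
        freq = sum(counts[t] for t in group)
        in_title = bool(group & title_words)
        if freq or in_title:
            matched += 1
        total += freq + (title_bonus if in_title else 0.0)
    # Reward articles in which every concept of the query is present.
    if len(concepts) > 1 and matched == len(concepts):
        total *= cooccur_bonus
    return total


def rank(query, articles):
    """Step 6: list article titles in descending order of weighting."""
    concepts = [expand(t) for t in key_terms(query)]
    scored = [(score(a, concepts), a["title"]) for a in articles]
    return [t for s, t in sorted(scored, key=lambda p: -p[0]) if s > 0]
```

Calling rank("Why is the sky blue?", articles) on a list of dictionaries
with "title" and "body" keys returns the matching titles, best first. An
article that mentions only "azure" is still found, which a plain Boolean
search for "sky and blue" would miss.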

The purpose of all of this computational effort is at least threefold:
to find the documents that a simple full-text Boolean search (find sky and
blue) would miss, to list the (potentially) most relevant articles first in
order to save the user's search time, and to make the user interface much
friendlier.

The retrieval engine has been designed to automate sophisticated 
searching rather than to achieve speed records. The indexing tools are not 
advanced as speed demons, either.

The software is designed for text, linked to graphics and multimedia 
objects. Where the retrieval platform will cooperate, font information and 
character attributes can be utilized, but the retrieval of text is not page 
oriented. Graphics and text can be shown together on the screen or viewed
separately. Users can copy material from the disc into the clipboard for
printing or incorporation into other applications.

smartdrr, the standard retrieval software, can search multiple
databases across devices and over a local area network. It can also operate
as a tsr under dos or as a separate Windows task, making online reference
material accessible by the computer user via a hotkey without the need to
exit another application.

For developers who require a custom user interface, a set of
C-callable subroutines, Smart-API, is available.

An enhancement to SmarTrieve planned for introduction this month will
add a virtual workspace capability. The virtual workspace, similar in
concept to Xerox Rooms, enables numerous articles or documents to remain
open simultaneously without being limited to the desktop space on a Windows
screen. The user can move in and out of SmarTrieve and return immediately
to a stored setup at a later date.

R.R. Donnelley's Database Technology Services Division is providing
SmarTrieve authoring services for Compton's NewMedia and for other
publishers not wishing to bring indexing and data conversion in-house yet.
Donnelley's dts group will also provide data-preparation services using a
number of the other authoring tools mentioned in this article. dts claims
to be "like Switzerland," neutral in the battles among the authoring
software giants. Compton's, headquartered near San Diego, also distributes
third-party cd-rom and floppy-based titles as well as its own works.

Summary. SmarTrieve's strengths are its easy-to-use interface,
including the capability to accept and process (but not parse) queries in
natural language, its sophisticated relevance ranking algorithms based on
both statistical and semantic approaches, its capability to link text with
graphics and multimedia objects for cd-rom reference titles and catalogs,
and its support for Mac, dos, Windows and Sony MMCD retrieval. A weakness
may be retrieval speed, due to the search for conceptually related material
rather than simple string searching.

Crowninshield: MediaBase

Crowninshield, located in Newton, MA, is one of the smallest authoring
software vendors, but more than 50 different cd-rom titles have been
published with its software, MediaBase, which has been in use since 1986.
Some commercial publishing applications include The Hutchinson
Encyclopedia, The Colorado Revised Statutes, ArtFacts, The Aircraft
Encyclopedia, Vietnam Remembered, The Plant Doctor, The Baseball Register
and The Massachusetts Administrative Law Library. Corporate and academic
customers include Ocean Spray Cranberries, Raytheon, The Social Law
Library, University of Colorado, University of Iowa and cd-rom Resource
Group.

Different flavors. Crowninshield's MediaBase full-text retrieval
package organizes text records and associated color or monochrome graphics
and multimedia files into an outline-structured database, with categories
and subcategories. Records can have specific fields containing numeric or
alphanumeric information contained within them. The outline is an organizer
for both the author and the user. The logical structure of a document or
collection of documents can be mapped into the MediaBase outline, and the
user can constrain searches to a particular level of the outline. The
typical title is directed at a general interest audience, pulling together
a variety of information on a given topic.
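
As a rough illustration of what constraining a search to a level of the
outline means, the sketch below stores each record with its outline path
and filters on a path prefix. It says nothing about how MediaBase itself
stores or indexes data; the record layout, the function name and the
sample records are invented for the example.

```python
def search(records, word, scope=()):
    """Return headings of records under outline `scope` that contain `word`.

    `records` is a list of (outline_path, heading, text) tuples, where
    outline_path is a tuple such as ("Plants", "Diseases").
    """
    word = word.lower()
    scope = tuple(scope)
    hits = []
    for path, heading, text in records:
        inside_scope = path[:len(scope)] == scope
        if inside_scope and word in text.lower().split():
            hits.append(heading)
    return hits


# Invented sample records for a hypothetical gardening title.
RECORDS = [
    (("Plants", "Diseases"), "Leaf spot", "fungus causes spots"),
    (("Plants", "Care"), "Watering", "water prevents fungus"),
    (("Tools",), "Sprayer", "spray fungus treatment"),
]

if __name__ == "__main__":
    print(search(RECORDS, "fungus"))               # whole outline
    print(search(RECORDS, "fungus", ("Plants",)))  # one category only
```

An empty scope searches the whole title, and a longer scope narrows the
search to a subcategory.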

MediaBase offers three standard full-text retrieval interfaces: 

1. MediaBase Light for dos, a function-key driven front end for novice 
users with the stress on simplicity; 

2. MediaBase Runtime, also for dos, with pulldown menus, Boolean 
searches, cross-category searches, notetaking and database extraction 
tools, record marking, recording of search "trails"; and 

3. MediaBase Windows, a Windows-based product that features a
graphical user interface and presents outline and record headings as well
as a scrolling word wheel. Fielded Boolean searches can be initiated via
mouse clicks, and hyperlinks cross datafiles or categories, pointing to
record headings. A mouse click orders up the display of the desired related
text record or image. Windows discs can access cd-audio on mixed mode
discs, as well as wav audio files.

For additional customization, a Developer's Package adds an api
function library to the basic Publisher's Package. These subroutines can be
called by a custom user interface, programmed by the developer or by
Crowninshield.

Indexing/retrieval. Authoring consists of designing the outline
structure and placing field and record delimiters within a stream of text.
Drop-down menus in the Editor or Outline Utility program guide the author
in importing and then indexing the data. After importing and indexing, it
is possible to rearrange the outline and the field orders, as well as to

about 20 mb per hour (with a 15-ms average access hard disk) and an index
overhead of 30 to 40% of datafile size. During actual indexing you will
need four times the size of the import file available on the hard disk for



Crowninshield claims that most searches on a MediaBase cd-rom title
can be completed within a second or two, even on low-end machines.

All of these figures are so typical of the industry's claims that we
will mention competitive figures only when they deviate significantly from


Premastering. Crowninshield also supplies premastering software,
called the CD-Formatter, which, like other premastering programs, can
output iso 9660 formatted image files to 8mm tape, digital audio tape,
9-track tape, or recordable cds and other removable media. CD-Formatter is
bundled with the Publisher's Package.

Summary. There are people who organize anything into a hierarchical
outline, from a shopping list to a lecture, and there are those who find
outlines obnoxious. MediaBase is a tool for the outliners. It turns out
that this structure can be congenial to the development of a wide variety
of titles, particularly reference materials.

Dataware: CD Author, ReferenceSet

Dataware Technologies claims to be "the largest and most complete
independent supplier of software and services to the cd-rom and multimedia
industry," employing 90 professionals worldwide. Founded in 1988, Dataware
merged with Reference Technology earlier this year. The new company
retained the Dataware Technologies name and Cambridge, MA, headquarters.
Prior to the merger, Dataware had been known for its fielded-data authoring
products, which had done well in the commercial publishing market.
Reference Technology was known for its full-text products and its success
in the government and corporate cd-rom data preparation niches.

Dataware cd-rom authoring software has been used to prepare more than
300 titles including:

* National and regional postal and telephone directories, including
discs by American Business Lists, the British Post Office, Deutsche
Postreklame (Germany), Nynex Information Technologies, ProCD, Read Only
Memory (Australia), the Swedish Post Office and U.S. West Communications.
Dataware itself, together with speedoial, publishes American Business
Information, a Yellow Pages directory of the U.S.

* In-house corporate products for companies such as A.M. Best, Eastman
Kodak, Dun & Bradstreet, Siemens and 3M.

* Standards and patent information for AFNOR (France), DIN (Germany),
the British Standards Institute, the European Patent Office, Research
Publications, the Spanish Patent Office, the U.S. Patent Office and WILA
Verlag (Germany).

* Catalogs, indexes and reference titles for universities in several
countries and for clients such as Baker & Taylor Books, the Association for
Computing Machinery, Datapro, IDG, National Information Services, R.L. Polk
and Thomson Financial Publishing.

* Company information discs and business directories for trw Business
Credit and others.

* Technical documentation and parts catalogs for R.R. Donnelley, Ford
New Holland, NEC (USA) and others.

* Legal information cd-roms by publishers in Germany, Spain and
Belgium.

* Discs containing newspapers and periodicals by Newsbank, Softline
Information, Inc., and several Canadian publishers.

* Government information from Canadian, Dutch, Italian, French and
Swedish agencies, plus, among other American agencies, the U.S. Geological
Survey, the U.S. Information Agency and the U.S. Navy.

* Time-series databases on cd-rom from woMd, Quick Source and the
Canadian national statistical agency.

With Reference Technology, Dataware now offers four authoring packages.
The former Reference Technology products are denoted the ReferenceSet
product family, which includes the text-oriented Full Text Build/Text
ReferenceBook and the structured data-oriented Key Record Build/Record
ReferenceBook. The two original Dataware products are in the CD Author
family: the structured/fielded data-oriented CD Author/CD Answer and the
text-oriented CD Author HyperText/CD Answer HyperText. There seems to be
considerable overlapping in functionality between the two text-oriented
packages and between the two structured-record packages.

Dataware presents itself as a complete service bureau and a software
vendor, capable of handling the software customization, data preparation,
and premastering if the client so desires. Dataware also sells a
premastering module called CD-Prepare.

CD Author/CD Answer

CD Author build software and CD Answer retrieval software are used for
structured/fielded-data titles, such as indexes, catalogs, directories,
patents, bibliographies, corporate customer data, statements, parts
inventories and government accounting data.

CD Author/CD Answer is known for searching large fielded databases
(several hundred megabytes on one disc) rapidly and effectively. It can
process 16 million records of 512k bytes each, each record containing up to
2,047 fields.

CD Author/CD Answer also has advantages in the following areas:

* Foreign language support (user interfaces for 13 European languages
plus Japanese Kanji);

* Cross-platform authoring and retrieval, with the same cd-rom disc
(containing multiple user interfaces but only one set of data and indexes)
usable on dos, Windows, Japanese dos/v, Mac and Unix environments. The same
database, but not the same disc, can also be used on cd-i players.

* Data compression, with the finished application, including the
indexes, often requiring less space than the uncompressed source data;

* Encryption;

* Advanced full-text search capabilities such as adjacency and
proximity;

* Flexibility of indexing (for example: phonetic "sounds-like" index;
many different possible formats for date fields);

* Output options (for example, export records in dBase format);

* Capability to link raster page images or pictures to structured
records; and

* Multimedia files linked to fielded data and played by launching
external programs.

CD Author/CD Answer text data is full-text indexed, as text, with all
words except stop words, or by line, with each field indexed as a unit. The
user may jump to related records by cross reference search. Boolean
searches within and across fields, range constraints, various record views,
and graphic zoom, pan, scale, print are supported.

Dataware provides interactive debugging of data conversion, permitting
the author to step through the data reading pass a field or record at a
time.

Dataware provides a wide range of customization options for CD
Author/CD Answer, some of them available to those who use the standard CD
Answer user interface, and others involving an api approach, available in
Dataware's Advanced Design Library, or modification of the CD Answer code.

Summary. This is a package for fielded data, with a high-performance
database engine, advanced support for foreign languages, cross-platform
development, multiple data types, data compression and encryption.

CD Author HyperText and CD Answer HyperText 

CD Author HyperText (build software) and CD Answer HyperText
(retrieval software) is Dataware's complementary product designed for
free-text applications, such as technical manuals, and databases that can
be defined as a collection of documents: legal/tax information, looseleaf
publications, corporate contracts, government procedures, regulations and
others.

This Dataware package can use sgml tagging in a source file to create
a logical structure within the cd-rom. The logical structure aids
navigation.

CD Author HyperText and CD Answer HyperText do not attempt fancy concept
mapping or complex relevance ranking, except for counting the number of
hits on each logical level. Searching and navigating in CD Answer HyperText
is an elegant electronic elaboration and combination of the means of
locating information in a book, such as the expandable/collapsible table of
contents representing logical structure and pre-embedded hyperlinks that
automate navigating to and from footnotes, citations, glossaries,
illustrations and cross-references.

As in its CD Answer fielded-data product, Dataware stresses search
speed and data capacity (up to 16 million sentences) as one of its strong
points, as well as graphics display and printing, links to multimedia
elements and support for 13 European languages.

In addition to text search and navigation, CD Answer HyperText
includes full-data searching capabilities as described in the previously

Unlike the fielded-data product, CD Author/Answer HyperText is only
available as a dos program. Fonts and character attributes are not stored,
but color may be used in the retrieval user interface. The original pages
can be scaled and viewed as raster files attached to text files. Graphics
are displayed and printed separately from text, not combined on the screen,
except in the case of the bitmapped scanned pages just mentioned.

Customization of the CD Answer HyperText application occurs as the
author specifies which fields are to be displayed on which screens. The
command structure, function-key mapping etc. are not modifiable.

Summary. Use of sgml tagging to represent logical structure of text
documents; used for collections of documents, technical manuals, legal
codes.

Full Text Build and Text ReferenceBook

Dataware ReferenceSet Full Text Build can author text-oriented titles
on a dos pc that are usable either on dos pcs or on Macintoshes. Text
ReferenceBook is the retrieval software. Dataware positions these products
as useful for technical manuals, encyclopedias, legal and financial
publications, catalogs, policy and procedure manuals.

Full Text Build can index 16 million documents in a collection, each
with a maximum size of 16 megabytes. An unlimited number of collections may
be indexed and collections may span cd-rom disc volumes.

Still images can be incorporated into Text ReferenceBook titles by
using the optional ImageBuild and ImageDisplay modules which permit
capture, compression, indexing, storing, retrieving and displaying of
images.

Like its Dataware full-text stablemate, CD Author/Answer HyperText,
the Full Text Build/Text ReferenceBook combination can accept and process
sgml-tagged documents. Three basic navigation techniques are supported:
browsing through the table of contents and hyperlinks; text searching; and,
in the dos environment, bookmark navigation. The Text ReferenceBook user
interface gives the user fill-in templates for constructing search queries,
and data displays for showing search results. The user may directly browse
in the table of contents or document hierarchy built from the sgml markup,
expanding and contracting the table fully, down to the level of the actual
text content in each section indicated by the toc outline.

A word wheel of indexed terms displays all the unique words in the
document collection, with frequency counts, i.e. number of documents and
number of occurrences. A thesaurus permits expanding a search to include
synonyms and suffixes. Bookmarks can be placed anywhere, and documents or
sections they correspond to may be selected later on.
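
The two frequency counts shown in a word wheel are easy to confuse, so a
small sketch may help. It is a generic illustration in Python, not
Dataware's implementation; the function name and the simple tokenizing
rule are assumptions.

```python
import re
from collections import Counter


def word_wheel(documents):
    """Return (word, documents_containing_it, total_occurrences) tuples.

    The result is sorted alphabetically, the way a word wheel scrolls.
    """
    occurrences = Counter()   # every appearance of a word
    doc_counts = Counter()    # number of documents containing the word
    for text in documents:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        occurrences.update(tokens)
        doc_counts.update(set(tokens))  # count each document only once
    return [(w, doc_counts[w], occurrences[w]) for w in sorted(occurrences)]


if __name__ == "__main__":
    for entry in word_wheel(["disc index disc", "index build"]):
        print(entry)
```

A word that appears twice in a single document has a document count of
one and an occurrence count of two.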

The author can select to configure the dos retrieval interface with
either pulldown menus, function key operation, or "hot spots." The author
can also modify the control flow, screen contents, function-key operation
and text display. Menus, for example, are modifiable in the configuration
editor. An api is not available.

The Text Configure tool set can be used to design dos windows,
buttons, color attributes and so forth. Macintosh environments are
configured by modifying resource files with the Macintosh utilities ResEdit
and ViewEdit.

The authoring process includes scanning source data for sgml tags,
full-text indexing, table of contents building, and index builds for
hyperlinks to images and other content. Inconsistencies in the input file
sgml tagging are flagged for revision.

Summary. Full Text Build/Text ReferenceBook forms a powerful dos/Mac
full-text retrieval package that can use logical document structure,
hyperlinks and a thesaurus for navigation and search. The capacity and
performance of this full-text package complements Dataware's
high-performance CD Author/CD Answer package for structured data, and was
one of the main inducements for Dataware to bring Reference Technology
under its roof.

Key Record Build, Record ReferenceBook

The Record ReferenceBook and its build module, Key Record Build, are
designed for fielded data and are suited to applications such as
directories, catalogs and bibliographic indexes, much as the CD Answer/CD
Author package. This combination, however, is strictly a dos product, unlike
the wide multiple-platform support of CD Author/Answer. Record
ReferenceBook has been designed to cope with very large databases: up to 2
billion records with up to 32,000 fields per record, each of which can be
indexed (given enough time and disk space). Images can be linked, as in the
Full Text/Text ReferenceBook product described above.

Building a cd-rom title with Key Record Build includes defining the
data structure, revising the files for consistency, indexing and optionally
compressing and encrypting the files. With compression, the data files plus
indexes can take up less disk space than the original source data.

Executive Technologies: Search Express




for collections of documents and related raster images. The documents can 
also be linked to multimedia files, but the audio player software or video 
viewer software would be external to Search Express. 

Search Express was developed specifically for slow optical disc drives
and has been refined since 1984. More than 1,000 companies have bought
Search Express, using it on all kinds of media. More than 20 cd-rom titles
have been pressed, by organizations such as the Environmental Protection
Agency, the U.S. Government Printing Office, Darby Printing,
Ingersoll-Rand, Wall Street Transcript, University of Alaska, Borden and
A.R.E.

Relevance ranking. Search Express has its own version of relevance
ranking. The central mission of the Search Express retrieval module is to
find documents containing one or more of the multiple terms that a user
believes might be relevant to a search, and to rank the documents in a
probable order of relevance. To initiate this process, the user manually
assigns a numerical weight to each search term in a list. The software
finds all documents containing any of the terms and ranks them by first
multiplying the word weight assigned to each word by the number of
occurrences of that search term in a document. Then these products are
summed to generate a document weight, which can then be ranked against
other documents.
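
The weighting scheme just described can be sketched in a few lines of
Python. This is only an illustration of the arithmetic (user weight times
occurrence count, summed per document), not Search Express code; the
function names, the sample documents and the weights are all made up for
the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_weight(text, term_weights):
    """Sum, over the search terms, of user weight times occurrence count."""
    counts = Counter(tokenize(text))
    # Terms are expected in lowercase, matching the tokenizer's output.
    return sum(weight * counts[term] for term, weight in term_weights.items())


def rank_documents(documents, term_weights):
    """Return (name, weight) pairs for documents with any hit, heaviest first."""
    scored = ((name, document_weight(text, term_weights))
              for name, text in documents.items())
    # Assumes positive weights, so a weight of zero means no term occurred.
    hits = [(name, weight) for name, weight in scored if weight > 0]
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data: three short documents and three weighted terms.
documents = {
    "memo-a": "Optical disc drives are slow; disc seek time dominates.",
    "memo-b": "Indexing text for retrieval from an optical disc.",
    "memo-c": "Quarterly sales figures.",
}
term_weights = {"disc": 3, "optical": 2, "retrieval": 5}
ranking = rank_documents(documents, term_weights)
print(ranking)
```

Here memo-a scores 3 x 2 + 2 x 1 = 8 and memo-b scores 3 + 2 + 5 = 10, so
memo-b is listed first; memo-c contains none of the terms and is left out.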

When the ranking is complete, the user is invited to select a likely
document header from the ranked list and zoom down into the document, then
the paragraph and then the sentence that contains one or more search terms.
The found words are highlighted by the program.

Boolean searches, with proximity parameters, can also be performed.
Automatic word rooting looks for variants of a word: plurals, past tense.

Text documents can contain any of the 256 ASCII characters but font
and character attribute information is stripped. Fields may be defined
containing alphanumeric, date or numeric data. Range searches may be
executed on the numeric and date fields.

Users can print and export data, but the author can limit these
features. Encryption is an option.

Graphics/links. Line-level hypertext can link a line in the text and
a graphic: "see Figure A" can trigger the image, Figure A. Ad hoc hypertext
searches can also be performed: the user highlights a word or phrase in a
document, and Search Express finds and lists the headers of other documents
with that key term in them. The user can then view these related
documents and escape back to the original document when finished.

A multilevel table of contents can be supplied to assist the user in
navigating through a structured collection of documents, such as a manual.

Non-English retrieval interfaces and other modifications of the Search
Express retrieval programs can be made by writing custom code with calls to
[...]

Search Express build modules can automatically index document files,
extracting the title from the top of a document, for example. Search
Express has its own markup language that can be employed to convert input
files to Search Express format.

Folio: Views, VIP, Previews

Folio Corporation has two DOS-only, CD-ROM free-form text-oriented
authoring packages: Folio Views version 2.1 for non-commercial projects
(with Folio Personal Edition or Folio Runtime retrieval software) and
version 2.5 for commercial (for-profit) publishing, mated with Folio VIP
(Views Infobase Personalizer) or Folio Previews retrieval software.

This month, Folio announced version 3.0 of its Views software. The new
version will be available for Windows, DOS, and, eventually, Macintoshes.
Our profile here is based on Version 2.1. A news story on the forthcoming
release is in the Latest Word section at the back of this issue.

Folio calls collections of free-form documents infobases. It sees
the task of its Views software as managing infobases: searching, grouping,
linking, editing, annotating and printing information from collections of
electronically stored documents.

Folio, founded in 1986, is privately held and based in Provo, Utah.
Folio saw its market as "personal electronic publishing" software for
corporate or in-house electronic publishing, bridging the gap between
desktop paper-based publishing and commercial electronic publishing via
on-line services or CD-ROM.

Folio Views was introduced to the market in 1989. The line between
in-house and commercial applications became blurred as Folio pursued a
strategy of licensing its retrieval software to the largest possible
installed base. Folio claims that its Views retrieval software for
accessing its infobase format is now licensed to be used by more than 20
million people. This wide availability of the Folio reader made authoring
in the Views infobase format an attractive option for commercial infobase
titles as well as in-house document databases. Folio tools have been used
in vertical niches such as legal, accounting, insurance and government.
Folio Views 2.0, released in 1990, added CD-ROM support, hypertext
linking, multiple database access and links to other applications.



Customers. Folio "infobase" electronic publishing software gained wide
exposure by being bundled with every copy of Novell's Netware LAN operating
system as the retrieval software for the Netware Help utility. Novell
subsequently published the Network Support Encyclopedia, a collection of
Netware documentation on CD-ROM, using Folio software. The Novell Netware
Help version of Folio's software can also be used to retrieve data from
infobases other than Novell's online documentation. These infobases can be
created with Folio Views, which is not part of Novell's retrieve-only Folio
software.

(For those interested in riding on whatever Novell does, Novell
expects to move next year away from Folio to an SGML-based viewer for its
documentation library.)

Folio software is used by Chase Manhattan Bank, Union Carbide, the
U.S. Army and the Internal Revenue Service. Publishers using Folio software
for electronic publishing (on magnetic media and CD-ROM) include
Prentice-Hall, Thomson International, the American Institute of Certified
Public Accountants (AICPA), the Financial Accounting Standards Board, and
Mead Data Central (The Michie Company). A recent Folio list shows 64
different infobase publishers using Folio software. Commercial publishers
are served by Folio's Publisher Division, set up last year.

Building an infobase. The building blocks of a Folio infobase are
paragraph-size blocks of text called folios. Searches find folios, and
links connect folios, whether within one infobase or between infobases,
even across networks. Boolean searching and proximity parameters aid
finding relevant folios, but there is no form of concept searching or
relevance ranking. Fielded data is not supported.

Authoring in Views follows these steps:

1. Filter input files into Folio Flat File formatting and coding, 
stripping or translating original control codes. 

2. Segment the text into folios, logical blocks.

3. Add a header to each folio.

4. Group folios into topics and hierarchies. A folio can be a
member of more than one group.

5. Link related sections of information, as in cross-references, 
footnotes, tables of contents or external programs. 

6. Create the infobase from the Folio Flat File. 

There are numerous software tools in the Views package to accomplish 
aspects of these procedures and provide quality control. 
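
To make the folio idea concrete, the following Python sketch segments a
text into paragraph-size folios, indexes every word, and runs a Boolean AND
search that returns folios rather than whole documents. It is a toy model
under stated assumptions, not Folio's file format or tooling: the function
names and the sample text are invented, and headers, groups and links
(steps 3 to 5) are left out.

```python
import re


def build_infobase(text):
    """Split text into paragraph-size folios and index each word by folio number."""
    folios = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    index = {}
    for number, folio in enumerate(folios):
        # Every word is indexed; this sketch keeps no stop list.
        for word in set(re.findall(r"[a-z0-9]+", folio.lower())):
            index.setdefault(word, set()).add(number)
    return folios, index


def search_all(index, *query_words):
    """Boolean AND: numbers of the folios that contain every query word."""
    postings = [index.get(word.lower(), set()) for word in query_words]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical source text: blank lines separate the three folios.
text = (
    "Netware help covers printing.\n\n"
    "Printing from a network queue.\n\n"
    "Backup schedules."
)
folios, index = build_infobase(text)
print(search_all(index, "printing"))
print(search_all(index, "printing", "network"))
```

Because the unit of retrieval is the folio, a query narrows to individual
paragraphs; the second search matches only the middle folio.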

Folio Views supports all extended ASCII characters and preserves bold,
underline, bold-underline, center, justification, tabs and column codes.
Text is not displayed as pages and graphics are not viewable simultaneously
with text.

Except for a PCX viewer and an RS (RealSound) audio player contained
within Views retrieval software, Views displays/plays nontext data by
launching external DOS programs via "hot links."

Retrieval. The Folio Views retrieval engine is the same for magnetic
and CD-ROM applications, but it has been optimized to minimize seek
operations on the part of the drive mechanism in either case.

Folio Views can significantly compress text files. Folio, in fact,
claims to be able to reduce the text and index of an infobase into a single
file that is half the size of the original datafile size alone. This
negative overhead has been tagged "Underhead Technology" by Folio's
marketing folks. Because of this extremely efficient compression, Folio
does not bother to create a stop list of high-frequency words that are
usually dropped from full-text indexes.

The user interface is Folio's own windowing environment. To search,
users select words from a word wheel window, construct a query in the query
window, and see the resulting number of hits in the results window change
interactively as the query is refined.

The interface is not customizable, beyond such matters as controlling 
the size and position of windows and being able to select the color scheme, 
but versions of the product are available in German, French and English. 

Summary. Folio's approach to text data has the limitation of being
based on small blocks of text and of lacking state-of-the-art support for
typography, relevance ranking, concept searching or document structure
tags. The data model and user interface have worked, however, for numerous
titles that can conform to the Folio constraints.

Knowledge Access: KAware

Knowledge Access International is a CD-ROM and electronic database
publisher that decided to make its own publishing tools, known as KAware,
available to the industry in the late 1980s. This was important,
historically, because the KAware build packages were among the first to be
offered at aggressively low purchase or first-title prices. KAware moved
the price of entry at least one and in some cases even two decimal places
to the left, out of the mainframe range into the domain of other desktop
[...]

[...] Access Associations (the Encyclopedia of Associations on disc) and the
American Library Association's Directory of Library and Information
[...] programmers. The major applications for KAware on CD-ROM have been
directories, bibliographic databases and full-text document collections.
KAware's CD-ROM authoring customers include the U.S. Government
Printing Office, the World Bank, the University of California, Electric
Power Research Institute, UPS, Thomas Publishing, Nabisco, Merck & Company,
Bristol-Myers, NOAA, Naval Air Development Center, the Food & Drug
Administration, Defense Technical Information Center and many others.

Packages. KAware has separate MS-DOS CD-ROM build/retrieve packages
for fielded data and for free text. Either one may be enhanced by an image
module that supports linking raster color and monochrome graphics (TIFF and
PCX) to text documents or fielded records. KAware Fielded and KAware
Full-Text user interfaces share about 90% of their commands. There is one
standard retrieval interface for KAware Fielded and two standard interfaces
for KAware Full-Text, including the new Quick Search (search... see...
print) user interface intended to be less intimidating for novices.

A Windows version of the KAware products is anticipated in the first
quarter of 1993. The Windows version of the retrieval system is being
written for cross-platform porting to Unix machines and the Macintosh.

In the full-text product, basic text-search functionality is present:
word search, phrase search, proximity search, Boolean operations, wild card
and truncation search, menu search, and hyperlinks to images and predefined
[...]

The full-text KAware Disk Publisher build package requires the input
files to be stripped of word processing and typesetting formatting into
plain ASCII. A KAware markup scheme is then used for fields, table of
contents and hypertext links, as supported in the full-text system.
Conversion modules process SGML and other coding schemes to translate the
codes that have an impact on searching or screen appearance.

Foreign character sets and other upper-ASCII characters are evaluated
for availability and location in the alphabetical sorting sequence. The
full-text product is available with interface support for French, Spanish
and [...]

An example of the multilingual capabilities is the forthcoming Compact
International Agricultural Research Library (CIARL), a 17-disc CD-ROM set
that uses trilingual KAware Full-Text/Image retrieval software. Distributed
by the Consultative Group on International Agricultural Research in
Washington, DC, the set is an international compilation of 200,000 pages
and 30,000 monochrome and color images.

Fielded KAware Disk Publisher requires the input files to be in
comma-delimited format. A conversion program is supplied to convert from
left-tagged to comma-delimited format, if necessary. Fields can be defined
as containing one of 16 different data types, and can be searched within
user-specified ranges. Records can be sorted for display and printing.
Subsets of the database can be defined and selected for output or
searching, and the set definitions can be saved. A typical KAware/Fielded
title is the Harris Selectory, 1993 National and Regional Manufacturers
Directories on Disc, available on CD-ROM and on floppy disks.
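
The kind of conversion mentioned above can be illustrated with a short
Python sketch. The article does not define the left-tagged layout, so the
input format here is an assumption: one "TAG: value" pair per line, with a
blank line between records. The function name, the tags and the sample
records are hypothetical, and this is not the KAware conversion program.

```python
import csv
import io


def tagged_to_csv(tagged_text, field_order):
    """Convert assumed 'TAG: value' records into comma-delimited rows."""
    records, current = [], {}
    for line in tagged_text.splitlines():
        if not line.strip():
            # A blank line closes the record being collected.
            if current:
                records.append(current)
                current = {}
            continue
        tag, _, value = line.partition(":")
        current[tag.strip()] = value.strip()
    if current:
        records.append(current)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for record in records:
        # Missing fields become empty columns so every row has the same width.
        writer.writerow([record.get(tag, "") for tag in field_order])
    return out.getvalue()


# Hypothetical directory entries; the second record has no SIC field.
sample = (
    "NAME: Acme Tool, Inc.\n"
    "CITY: Dayton\n"
    "SIC: 3423\n"
    "\n"
    "NAME: Borden\n"
    "CITY: Columbus\n"
)
converted = tagged_to_csv(sample, ["NAME", "CITY", "SIC"])
print(converted)
```

The csv module quotes the first company name because it contains a comma,
which is the main pitfall of producing comma-delimited files by hand.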

In addition to the KAware Disk Publisher and KAware retrieval
software, Knowledge Access sells a CD-ROM premastering and simulation
system. As do most current premastering systems, the KAware one can now
write an "authored" application into ISO 9660 format and onto a recordable
CD on a write-once CD drive, where the disc can be used for in-house
prototyping or low-volume distribution.

KAware's seminar workbook, Publish on Disk, gives prospective
authors an excellent idea of the issues involved in using KAware, or other
CD-ROM authoring packages. Some of the content will, of course, have to be
updated when the GUI versions of the product become available.

Summary. Knowledge Access has provided two authoring systems (for text
and fielded data) that between them aim to accommodate much of the
character-based and graphics data published on CD-ROM. With the development
of a GUI version under way, the user interface, which has the flavor of
older online library database products, is presumably being made more
appealing to users who are not information retrieval professionals. The
provision of a simple QuickSearch interface option for the text-oriented
product was an earlier step in this direction.

KnowledgeSet: Knowledge Retrieval System, DeskTop DataPrep (DTDP)

KnowledgeSet offers DeskTop Dataprep (DTDP) authoring tools, Knowledge 
Retrieval System (KRS) retrieval software, and KRSAPI, an application 
interface that permits custom-developing a user retrieval shell. 

KRS and DTDP software is designed to prepare and deliver CD-ROM-based
technical documentation, proceedings of technical conferences, catalogs and
context-sensitive help systems across a wide range of delivery platforms,
DOS, Mac and Unix, with authoring on a Sun SPARC Unix workstation or a 386
DOS [...]

Because of its support for CAD/CAM graphics formats and tech-doc work
in general, KnowledgeSet is the kind of software that might be used to
produce CD-ROM maintenance manuals for jetliners, for example.

KnowledgeSet is a text-and-graphics package, with multimedia audio and
[...]
PICT, useful for technical drawings. The files can be encrypted for
security or to protect copyrighted data. Hyperlinks from points in graphics
are not yet supported, but are planned.

European and Asian character sets are supported. Fonts and character
attributes are preserved. Straight ASCII text can be directly input prior
to markup; RTF, Frame, Interleaf and SGML can be filtered. Full support for
SGML will be available in 1993. PostScript support is planned, but at
present pages are not represented as such.

Graphics and text can be viewed together or separately, depending on
the platform. Searches can be done across multiple documents on multiple
databases. Relevance ranking is determined by number of matches within a
document. A dictionary is provided to identify terms in the database and
ensure proper spelling. Proximity and Boolean searches are supported.

Some hyperlinks are pre-indexed, such as references, citations and a
table of contents outline. Linked lists of articles resulting from a
search, history and path lists, notes and bookmarks are created on the fly
and can be saved.

Searches can be limited to identified fields: headings only, text,
footnotes, bibliography, etc., within the fields supported in the text
documents. Boolean searches within and across fields are supported, but
range searching for dates and numbers is not.

The user interfaces feature pulldown menus for experts, buttons for
the casual user. Navigation aids include citations, references, outlines,
bookmarks, notes, history files, path files and saved queries. Non-English
versions of the user interface can easily be created.

Summary. This package has been optimized for technical documentation.
Support for typography and graphics, including vector graphics file
formats, is good and getting better, with full SGML and PostScript support
planned.

MicroRetrieval: Re:Search

MicroRetrieval is a new company that in 1992 purchased the assets of
Retrieval Technologies, the firm that developed Re:Search, an MS-DOS
full-text and image CD-ROM data preparation and retrieval package.

MicroRetrieval's Re:Search Search Module is organized around
"catalogs" of text documents and images. Page breaks in the text and text
page formatting can be preserved, along with character attributes, but not
fonts. Fielded records may be defined, with fields supporting date, numeric
and text data. [...] a "relevancy ranked" list of documents.

The relevance ranking is based on a simple frequency count of hits in a
given document. The listing of documents can be toggled between a
relevance-ranked sort and an alphabetical sort.
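
A frequency-count ranking with a switchable listing order is simple enough
to sketch in Python. The code below is an illustration only, not Re:Search
code; the function names, the `order` parameter and the sample catalog are
assumptions made for the example.

```python
import re


def hit_count(text, terms):
    """Count the word tokens in the text that match any search term."""
    wanted = {term.lower() for term in terms}
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for token in tokens if token in wanted)


def list_documents(catalog, terms, order="relevance"):
    """List titles of matching documents by hit count or alphabetically."""
    hits = {title: hit_count(text, terms) for title, text in catalog.items()}
    matching = [title for title, count in hits.items() if count > 0]
    if order == "relevance":
        # Most hits first; ties fall back to alphabetical order.
        return sorted(matching, key=lambda title: (-hits[title], title))
    if order == "alphabetical":
        return sorted(matching)
    raise ValueError("order must be 'relevance' or 'alphabetical'")


# Hypothetical catalog of three titled documents.
catalog = {
    "Zoning rules": "patent patent filing patent",
    "Annual report": "one patent mentioned",
    "Budget": "no match here",
}
by_relevance = list_documents(catalog, ["patent"])
by_title = list_documents(catalog, ["patent"], order="alphabetical")
print(by_relevance)
print(by_title)
```

The same set of matching documents is returned either way; only the sort
key changes, which is why toggling between the two views is cheap.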

Hyperlinks will be supported at the beginning of 1993: word to word,
phrase to phrase, and from images to text and text to images.

Red Book Audio (CD digital audio) may be linked to a text or image
line and searched on by description tag. Motion video will be supported in
the forthcoming Windows product.

An API toolkit is available for customizing the user front end.
The DOS Build Module allows the developer to:

* Create a hierarchy of documents;

* Tag images, audio and fields;

* Title text files;

* Create paths;

* Modify the stop list; and

* Index.

Retrieval Technologies' zero-runtime-royalty pricing of Re:Search
retrieval software, continued by MicroRetrieval, is probably one of the
factors in the use of Re:Search by a number of government agencies,
organizations, publishers and corporate clients: the Joint Chiefs of Staff,
the U.S. Department of Agriculture, the IEEE (for conference papers),
American Insights (technology and patent information), Wm. C. Brown
Publishers (Multimedia College Biology Textbook on CD-ROM), Coopers &
Lybrand (resumes), Cotton Incorporated (Textiles and Patterns), Fidelity
Investments (marketing articles) and others.

A Windows version of both the build and retrieval modules is expected
in the first quarter of 1993. The Windows program will support 24-bit color
images and will also have a version that complies with the CD-RDx standard
for distributing data on CD-ROM that is decoupled from a particular user
interface.

Summary. Re:Search is designed to find documents relevant to search
criteria, using defined fields and free text, and to display image as well
as character data. Its pricing, with zero runtime royalties, differentiates
it from other packages more than its features, though its straightforward
user interface also makes it appealing.

Nimbus Information Systems: Romware

How did a British audiophile record company spawn a CD-ROM authoring
package marketed from Virginia? It went something like this: producing
analog LP records of highest quality led to manufacturing audio compact
discs. Hop across the "pond" and set up a CD plant near Charlottesville.
From CD production, why not CD-ROM? Set up a CD-ROM division. To support
and encourage CD-ROM publishers, develop authoring tools. There you are, in
an antebellum farmhouse in Ruckersville, VA, next to a CD factory. The
authoring and retrieval software, called Romware, and the Nimbus
replication facility's production services are sold separately; one can be
used without the other.

Romware is another zero-runtime-royalty product, one of the first to
be priced with that flexibility. Romware accepts and searches free-form
text, can store information about its logical structure (TOC, etc.), and
also permits definition and manipulation of structured records with defined
fields. As the Nimbus product sheet says, "Not all database management
systems can work best with all types of data. But Romware is designed to
work with all of your data."

Electronic publishers can prepare data with the Romware Database Build
and Indexing System, and put the Romware DBServ Retrieval Engine for DOS
and Windows on the discs. If they like the standard user interface, RW30,
they can put that on the disc, too. If not, Nimbus might have another user
interface on the shelf, since several have been developed. If not, the
CD-ROM developers can program their own "client" interface that can
communicate (via Romware's query language) with the Romware database server
software.

Romware retrieval software was designed to run directly from the
CD-ROM, without using space for temporary files, index files, or the
retrieval modules themselves on the hard disk. It was also written to run
acceptably on ancient PCs, though of course it likes faster machines, more
memory and contemporary display technology.

Romware fields in structured records can be defined as text, numeric or
date. Text fields can be indexed by word or by the whole entry. Dates can
be entered in one of 11 formats. Free-form text documents can flow the text
with word wrap or preserve the author's line breaks. Graphics are stored in
PCX and uncompressed TIFF graphics files, audio as Red Book audio.

Full-text indexing is possible, using a stopword list that can be
modified by the author. Relevance ranking and fuzzy logic are not
supported. Word stemming is enabled, however, and proximity searches can be
processed at the word, sentence or paragraph level by the database server,
i.e., the retrieval engine, but not yet with the standard interface (RW30).
Complex Boolean searching, including nesting and set operations, is
supported -- for those who can keep track of parentheses.
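
The three proximity levels can be pictured with a small Python sketch. It
is a simplified illustration, not Romware's query language or server: the
function names and the `level` and `distance` parameters are hypothetical,
sentences are assumed to end at ".", "!" or "?", and paragraphs are assumed
to be separated by blank lines.

```python
import re


def words(text):
    """Lowercase word tokens of a text, in order."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within_words(text, a, b, distance):
    """True if some occurrence of a lies within `distance` words of b."""
    tokens = words(text)
    pos_a = [i for i, token in enumerate(tokens) if token == a]
    pos_b = [i for i, token in enumerate(tokens) if token == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


def same_unit(text, a, b, splitter):
    """True if a and b both occur inside one unit produced by the splitter."""
    for unit in re.split(splitter, text):
        tokens = set(words(unit))
        if a in tokens and b in tokens:
            return True
    return False


def proximity_match(text, a, b, level, distance=5):
    """Test two terms for proximity at word, sentence or paragraph level."""
    a, b = a.lower(), b.lower()
    if level == "word":
        return within_words(text, a, b, distance)
    if level == "sentence":
        return same_unit(text, a, b, r"[.!?]+")
    if level == "paragraph":
        return same_unit(text, a, b, r"\n\s*\n")
    raise ValueError("level must be 'word', 'sentence' or 'paragraph'")


# Hypothetical two-paragraph document.
sample = (
    "Graphics are stored as PCX files. Audio is stored as Red Book audio.\n\n"
    "Dates use one of several formats."
)
print(proximity_match(sample, "graphics", "pcx", "word", distance=4))
print(proximity_match(sample, "pcx", "audio", "sentence"))
print(proximity_match(sample, "pcx", "audio", "paragraph"))
```

The same pair of terms can fail at sentence level and succeed at paragraph
level, which is the practical reason for offering all three scopes.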

Fielded data can be searched by Boolean queries, too, and data in a
search set may be sorted on up to nine fields. Multi-valued fields permit
[...]

On-the-fly "hypersearches" can be performed to find and display other
records containing the same terms as the original record, in the same
fields. Less constrained searches for related material can look in other
fields as well.
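
One way to read the "hypersearch" idea is as a lookup for records that
share a term with the record on screen, either field by field or across
all fields. The Python sketch below follows that reading; it is an
interpretation for illustration, not Romware's implementation, and the
function names, the `same_fields` flag and the film records are invented.

```python
import re


def terms(value):
    """Set of lowercase word tokens in a field value."""
    return set(re.findall(r"[a-z0-9]+", value.lower()))


def hypersearch(records, origin_id, same_fields=True):
    """Find other records sharing a term with the origin record.

    With same_fields=True a term must recur in the same field; otherwise
    a term from any field may match any field of the other record.
    """
    origin = records[origin_id]
    origin_all = set().union(*(terms(value) for value in origin.values()))
    related = []
    for record_id, record in records.items():
        if record_id == origin_id:
            continue
        if same_fields:
            shared = any(terms(origin[field]) & terms(record.get(field, ""))
                         for field in origin)
        else:
            other_all = set().union(*(terms(v) for v in record.values()))
            shared = bool(origin_all & other_all)
        if shared:
            related.append(record_id)
    return related


# Hypothetical film records keyed by record number.
records = {
    1: {"title": "Vertigo", "director": "Hitchcock"},
    2: {"title": "Psycho", "director": "Hitchcock"},
    3: {"title": "Hitchcock", "director": "Gervasi"},
    4: {"title": "Alien", "director": "Scott"},
}
print(hypersearch(records, 1))
print(hypersearch(records, 1, same_fields=False))
```

Record 3 is found only by the less constrained search, because the shared
term appears in its title rather than in its director field.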

The author can modify RW30 function keys and menu options within the 
standard interface and standard authoring tools. 

Some of Romware's other features: multilingual support, encryption,
data compression, import utilities that can read dBase III .dbf files or
fixed-length record files and selection among ten user-defined character
sets in any one field.

Nimbus publishes a periodically revised demo CD-ROM called Romware
Magazine. A recent "issue" contains a movie database with more than 1,700
film entries, a CD-ROM product catalog, a database of responses to opinion
polls, a portion of the National Trade Data Bank, some (unclassified) Army
Logistics inventory data, an example of a Romware textbook interface,
Bookface (from the Life and Work of Sir Isaac Newton), and some programming
and CD-ROM utilities. The variety of sample material is designed to invite
evaluation of the search engine and to give prospective authors a look at
different interface possibilities, beyond RW30. The film database, for
example, can be viewed with either a DOS or Windows Romware user interface.

Summary. Romware, another package with no per-disc royalty fees
(although there are per-title fees for the build tools), can be used well
for reference works that are composed of structured records, searchable
text and pictures. Romware runs from the CD-ROM drive alone; the retrieval
database server has been designed to accept multiple user interfaces.

Ntergaid: HyperWriter/HyperReader

Ntergaid was founded in 1987 to develop hypertext authoring tools.
Based in Fairfield, CT, Ntergaid has ten employees. Its first product was
called Black Magic.

Remembering Ntergaid's product names is easy: HyperWriter is the data
preparation software; HyperReader is the retrieval software.
HyperWriter-based CD-ROMs include the 1991 and 1992 Microsoft CD-ROM Show
discs and the PC SIG CD-ROM Compendium, 11th Edition. Someone obviously
thinks that HyperWriter/HyperReader, more commonly used on magnetic media
and over networks, is a serious CD-ROM data preparation and retrieval
software contender.

In addition to HyperWriter/HyperReader and related software, Ntergaid
offers HyperWriter for Training, an authoring tool for computer-based
training that adds testing, grading, reporting and student management to
the HyperWriter hypermedia authoring package.

Ntergaid's packages are graphical, object-oriented programs that make
some of the other CD-ROM retrieval and authoring tools seem dated.
HyperWriter applications can involve large quantities of text data, plus,
as required, graphics, audio, video, and animation. Text can appear in
scrolling or card form, both within the same document if desired. A
"document" can in theory be up to 136 gigabytes.

Hyperlinks can lead to text, to graphics and to actions, including
"spawning" external applications and device controllers. Links can be
bidirectional. Links can lead from any area of a graphic image and can lead
to "attributes" assigned to text, such as author, date created and access
rights. Links can lead to information in a pop-up window. Multi-level
documents can be created, serving users with differing needs or
backgrounds [...] and permit the creation
of a user interface to fit the application, with multiple windows, buttons,
control panels, graphics and menus, in both DOS and Windows. Stylesheets
can specify text fonts, colors, justification and positioning.

Navigation tools include graphical mapping of the document,
bookmarking, infinite retrace, full-text indexing and standard Boolean
search.

HyperWriter directly imports ASCII, WordPerfect and Ventura tagged
text. The recently released HyperWriter AutoLinker module converts, in
addition, Microsoft Word for MS-DOS, Ventura [...]
Markup, and a limited subset of SGML markup. One of the AutoLinker tools is
the language HyperAwk, which can be used for creating
HyperReader-compatible documents and links from source documents. Some
HyperAwk code can be generated automatically by an automated programming
[...] within AutoLinker. Ntergaid claims that the AutoLinker tools for
converting large quantities of text files into HyperWriter/Reader format
and indexing them will make the wider use of Ntergaid's software for
databases of CD-ROM scale much more feasible.

The developer's kit sells for $1,595. Note: there are no
per-title or per-unit royalty fees for titles published with HyperWriter
and distributed with HyperReader. Furthermore, HyperWriter creates
cross-platform compatible DOS and Windows hypermedia titles. Both DOS and
Windows retrieval software can be placed on the same disc, with a single
set of data files. Mac and Unix retrieval packages are planned for delivery
later.

Summary. The Ntergaid authoring system supports almost every
imaginable kind of hyperlink among text elements and related information.
It would be useful for manuals, catalogs, encyclopedias -- any title with
complex cross-referencing. Its up-to-date and customizable user interface,
multimedia support and text file conversion utilities, combined with its
aggressive pricing, earn it at least a look for almost any title.

Online Computer Systems: Opti-Ware

Online Computer Systems was founded in 1979 to help corporations and
government agencies master the electronic delivery of information.
Currently there are more than 100 employees. Hundreds of CD-ROM titles have
been developed with the Opti-Ware tools and nearly 200,000 copies of the
Opti-Ware retrieval software have been distributed. But don't look for a
copy of Opti-Ware authoring tools on the shelves of your neighborhood
Egghead store.

Online's Opti-Ware is a powerful set of tools for creating and
retrieving CD-ROM titles. But it is not a commercial package for the
do-it-yourselfer. The Opti-Ware tool set in reality is not a packaged
"authoring system." What Online does for clients wishing to author their
own titles is to custom design the authoring system, selecting the relevant
user interface and subset of data preparation tools for a particular
customer and application.

The authoring platforms are VAX/VMS, Unix systems, or IBM mainframes
under MVS. The delivery platforms include DOS, Windows, OS/2 PM, Mac and
Unix/X Window.

As the price chart shows, this service carries a major-league price
tag, but it can produce major-league products:

* Bibliographic and other fielded-data reference products (such as
Bowker's Books in Print, Variety's Video, or the Library of Congress
CD-MARC/Names, which contains millions of records);

* Technical documentation systems, such as GTE maintenance documentation;

* Multimedia encyclopedias and other full-text products, such as
Grolier's Multimedia Encyclopedia, Scientific American Medicine; and

* Illustrated parts and products catalogs, such as the Whirlpool Parts
Catalog, or the Cahners Computer-Aided Product Selection system (CAPS),
which spans 47 CD-ROM discs. (Opti-Ware permits CD-ROM databases to be
distributed over multiple drives and jukeboxes.)

Robust product. Online can deal with an extremely wide range of text,
graphics and multimedia formats, non-English character sets, page
description languages, compression standards, encryption options, etc.

Retrieval features include full-text search, proximity searching,
Boolean searching and user-definable thesauri. Hyperlinks, both preindexed
and user-generated on-the-fly, are supported. Retrieval software can use
the logical hierarchy of a document for navigation and searching. Opti-Ware
also supports Boolean searches through fielded data and range searches
through numeric data.

Opti-Ware has developed special graphic features for applications such
as parts catalogs, where exploded-view diagrams are linked to parts-list
text data. A part number can be exported to an external application from
the catalog page. Multimedia elements are linked to specific text sections,
but their indexes can be separately browsed or searched.

Summary. Not really one packaged product, Opti-Ware is a large family
of authoring tools and retrieval routines from which Online selects to
[...] come
cheap, and the authoring stations require more powerful hardware than a PC
or a Mac.

Personal Library Software: Personal Librarian

Personal Library Software, headquartered in Rockville, MD, was founded
in 1983 to bring to market full-text document management and electronic
publishing technologies that were emerging from university-based research,
such as that of PLS founder Matthew B. Koll. The first product was SIRE, a
full-text retrieval product using Koll's search heuristics that ran on PC,
VAX and Unix platforms. SIRE was the direct ancestor of Personal Librarian,
which was introduced in 1986. Personal Librarian became one of the first
Windows packages with WPL (Windows Personal Librarian) in 1988. Last summer
PLS released Document Manager System, a package for in-house use that
marries its full-text search tools with OCR and document imaging. The DMS
program competes with similar recently introduced comprehensive
page-image-plus-text document-management packages from full-text
competitors such as ZyLab, Executive Technologies and a growing number of
document imaging vendors.

Personal Librarian, in its DOS, Windows, Mac, Unix and VMS flavors, is
designed to manage large libraries of text documents, such as those on
CD-ROM, for information dissemination or for internal use. Personal
Librarian can link fielded and unstructured information and images to
documents. It supports hyperlinks between documents, and it enables
full-text search of conventional kinds.

Added value: concept search. The real power of Personal Librarian,
however, is its ability to help the user find as quickly and as accurately
as possible the documents in a database that are most relevant to a given
search interest. The basic problem that Personal Librarian addresses is the
classic "You don't know what you are missing," the notorious failure of
conventional Boolean full-text search methods to retrieve most of the
documents, in a collection of periodical articles or resumes, for example,
that are really relevant to a searcher's intent.

There is growing academic literature on the subject of developing
"smart search" procedures and a number of approaches and variations on
approaches under way. Already noted, SmarTrieve, Compton's NewMedia
product, combines several methods, including one based on a semantic
network, a super thesaurus that tries to map how terms are conceptually
related to one another in hierarchies of relationships. Search Express,
from Executive Technologies, relies on manual weighting of the importance
of each of as many as 15 search terms by the searcher, a simple, but
effective, approach to going beyond Boolean and, or, not logic.

Descended from software that was known as SIRE, Personal Librarian does
not require a predeveloped thesaurus or semantic network, or manual
weightings. Rather, it uses statistics derived from the database
itself: how often terms occur and co-occur in the library of documents and
how closely (with what proximity) they do so. Some of the techniques are
these:

1. Weight rare terms in a search query more heavily than words or
phrases with a high frequency of occurrence in the database.

2. Weight co-occurrence of two different search terms in the same
document more heavily than the occurrence of only one.

3. Weight the co-occurrence of search terms in close proximity of one
another more highly than if they are far apart in the document.

4. To expand a search to a broader "conceptual" level, list for the
user words that the Personal Librarian discovers to have a pattern of
association with the user's requested word(s) and ask which words should be
added to the search list. Alternatively, broaden the search by
automatically adding all of the new-found words to the search query.

Another approach is to have the user select the most
relevant documents or paragraphs from those initially identified and then
to let the software extract words from them to use as query terms for
further searching. Each successive search produces a relevancy-ranked group
of documents screened by procedures such as #1, #2 and #3.
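
As a rough illustration of how procedures #1 through #3 might combine into
a single relevance score, here is a minimal Python sketch. It is not the PLS
algorithm, which the article does not specify. The function names, the
logarithmic rarity weight, the flat co-occurrence bonus and the five-word
proximity window are all assumptions chosen for illustration.

```python
import math
from itertools import combinations

PUNCT = '.,;:!?"()'

def tokenize(text):
    """Lowercase, split on whitespace and strip simple punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def score(terms, doc_tokens, doc_freq, n_docs, window=5):
    """Toy relevance score for one document (all weights are assumptions)."""
    positions = {t: [i for i, w in enumerate(doc_tokens) if w == t] for t in terms}
    present = [t for t in terms if positions[t]]
    # Procedure 1: rarer terms (low document frequency) weigh more.
    total = sum(math.log(1 + n_docs / doc_freq[t]) for t in present)
    for a, b in combinations(present, 2):
        # Procedure 2: bonus when two different query terms co-occur.
        total += 1.0
        # Procedure 3: extra bonus that grows as the terms get closer.
        gap = min(abs(i - j) for i in positions[a] for j in positions[b])
        if gap <= window:
            total += 1.0 / gap
    return total

def rank(query, docs):
    """Return document ids ordered from most to least relevant."""
    tokens = {d: tokenize(text) for d, text in docs.items()}
    terms = list(dict.fromkeys(tokenize(query)))  # de-duplicate, keep order
    doc_freq = {t: sum(1 for toks in tokens.values() if t in toks) for t in terms}
    scores = {d: score(terms, toks, doc_freq, len(docs)) for d, toks in tokens.items()}
    return sorted(scores, key=lambda d: scores[d], reverse=True)
```

Calling rank("optical disc", docs) on a small dictionary of document texts
returns the ids in relevance order; a document in which both words appear
side by side outranks one in which they are several words apart.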

CD-ROM titles published with Personal Librarian software include the
Library of Congress American Memories Collection, the ABC News Disc (a
listing of available news footage), the U.S. Code (published by the House
of Representatives), NIH grant applications, the EPA Risk Assessment
Library, the Financial Times, The Economist, and technical documentation
for companies such as Bull, Unisys and ICL.

Summary. This full-text search program provides a statistical approach
to concept searching and relevance ranking that can efficiently find
relevant documents or paragraphs when standard Boolean queries are
ineffective.

Textware Corporation: Textware

Textware, founded in 1989, is headquartered in Park City, Utah.
Privately held, it has ten employees.

Textware is a full-text indexing and retrieval package with a PCX
image viewer. Textware is often used on magnetic hard disk drives as a text
database manager for individuals or workgroups, but it is also a CD-ROM
authoring tool. It uses a card-file metaphor to describe collections of
documents. A "card" in a Textware CardFile can be a paragraph, a page, an
entire document, or a database record with defined fields. Each card is
given a [...]-character (maximum) header that labels it in search hit lists.

Both indexing and retrieval portions of the package are available for
DOS and Macintosh System 7. Pulldown menus, dialog boxes and mouse support
are available in the character-based DOS user interface; the Mac version
utilizes a familiar Mac graphical user interface.

Example customers include the Federal Deposit Insurance Corporation,
which uses Textware to publish quarterly bank performance reports on
CD-ROM. Quanta Press, one of the CD-ROM publishing pioneers, has authored
titles with Textware ranging from the About Cows disc to the CIA World
Factbook on CD-ROM. Also published as Textware CardFiles are Wayzata
Technology's Place-Name Index on CD-ROM and Disc to the Future, a
collection of more than 200 megabytes of programs and utilities for Mac
programmers.

Textware Retrieve Only software can be included on Mac and DOS
CD-ROM disks. A royalty-free retrieval module, Textware Lite, is available
for DOS only. (A Mac version is expected in the second half of this year.)
Textware Lite contains a subset of Textware Retrieve Only features: the
Lite program cannot access or search across multiple CardFiles
simultaneously and can only output blocked text to disk or printer.

Files converted to Textware's "ASCII-like" format can be compressed to
about 50% of their original size. Textware indexes usually are only 10-20%
of the size of the original text files.
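
The two ratios quoted above allow a rough estimate of the disc space a
Textware title might need. The short Python sketch below assumes, for
illustration only, that the compressed text and its index are the only
items stored and that the index percentage is measured against the
uncompressed text. The function name and its parameters are hypothetical.

```python
def textware_footprint(text_mb, compressed_pct=50, index_low_pct=10, index_high_pct=20):
    """Estimate (low, high) megabytes for compressed text plus its index.

    Percentages are taken from the article's round figures; how Textware
    actually lays out a disc is not described there.
    """
    compressed = text_mb * compressed_pct / 100
    low = compressed + text_mb * index_low_pct / 100
    high = compressed + text_mb * index_high_pct / 100
    return low, high
```

Under these assumptions, 100 megabytes of source text would occupy roughly
60 to 70 megabytes once compressed and indexed.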

Image support. Images in other than the PCX format, plus multimedia
data of all kinds, can be viewed by automatic spawning of the appropriate
external viewer or player program. Image files, but not specific points in
the pictures, can be linked to text. WordPerfect and Word files with
graphics can be imported, preserving the links between text and graphics.
Files from these two word processors can be indexed without conversion.

Indexing/searching. Fielded records are imported into templates and
users can search across one or more fields. Fielded data searches are
limited to simple word searches, with wildcards.

Full-text searches are aided by a word wheel. A search dictionary,
sticky notes and bookmarks are also available. Boolean, phrase, proximity
and wildcard searches are supported. Boolean operators that Textware
recognizes are and, or, xor, not, andnot and ornot. Proximity can be
specified to reflect order and closeness.
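
The six operators can be pictured as set operations over the cards that
contain each word. The Python sketch below is only an illustration: the
article does not define the operators, so reading andnot as "A and not B"
and ornot as "A or not B" is an assumption, and build_index, combine and
negate are hypothetical names rather than Textware routines.

```python
def build_index(cards):
    """Map each lowercase word to the set of card ids that contain it."""
    index = {}
    for card_id, text in cards.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(card_id)
    return index

# Binary operators as set operations; `everything` is the set of all card ids.
BINARY_OPS = {
    "and": lambda a, b, everything: a & b,
    "or": lambda a, b, everything: a | b,
    "xor": lambda a, b, everything: a ^ b,
    "andnot": lambda a, b, everything: a - b,                # assumed: A and not B
    "ornot": lambda a, b, everything: a | (everything - b),  # assumed: A or not B
}

def combine(op, left, right, everything):
    """Apply one binary Boolean operator to two hit sets."""
    return BINARY_OPS[op](left, right, everything)

def negate(hits, everything):
    """Unary not: every card that is not in `hits`."""
    return everything - hits
```

A query such as "engine andnot parts" would then be evaluated by looking up
each word in the index and passing the two hit sets to combine.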

Searches yield a hit list of cards that contain either the search
criteria or synonyms. Users can jump to the cards, which display the found
search words in highlighted form. They can scroll through a card's text or
jump to another card on the hit list. With Textware Retrieve Only, blocked
text, individual cards or all cards on the hit list can be output to disk,
printer or memory clipboard.

No non-English retrieval shells are available off the shelf, but a
Textware Toolkit enables a programmer to attach a custom user interface to
Textware's C-language subroutines.

Many common word processing formats can be imported into Textware.
Automated file conversions are supported, including indexing parameters,
but a markup language for manually preparing files is also provided. Text
formatting, including bold and underline attributes, can be retained. Text
links can be defined between one or more words and another word, several
words in sequence, one or more images, another executable program, a sticky
note or a bookmark.

Cards can be linked to one another in a CardFile to form groups, which
can include up to 8,000 cards. A group could be a sequence of cards making
up a section of a book, for example, or it could be a logical grouping of
paragraphs relating to a particular topic, dispersed throughout the
CardFile or document database.

The CardFile structure and Textware's unique search capabilities make
it a CD-ROM authoring package to consider for a variety of text database
types.

TMS: InnerView

Founded in 1981, TMS is one of the pioneers of electronic publishing
technology. TMS demonstrated the use of optical discs for text databases
ten years ago, developed one of the first hypertext implementations and
played an important role in the specification of the High Sierra/ISO 9660
CD-ROM formatting standard. TMS also developed one of the first
all-software image compression/decompression toolkits, which it licenses
widely to the electronic publishing and document image processing
industries.

The TMS InnerView product family integrates image display, hyperlinks
and full-text searching. As expected from its heritage, page image display
and navigation from compressed raster-scanned documents is supported more
fully than in most competing packages.

InnerView can manage the content of many books on one CD-ROM. It has
been customized for Pratt & Whitney jet engine maintenance documentation,
General Dynamics technical support system (with engineering drawings up to
H size), Price Waterhouse laptop CD-ROM resource and LAN-access CD-ROM tax
reference database. Arthur Andersen uses InnerView to publish more than 100
reference books on CD-ROM for its field auditors. NILS uses InnerView to
prepare insurance law and regulation databases on all 50 states for CD-ROM
distribution to the insurance industry.

One authoring/multiple retrieval packages. The InnerView authoring
module is called InnerView Database Preparation Software. The retrieval
software can be InnerView Retrieval Software, for Windows and Macintosh, or
it can be the simpler QuickView package for Windows. A software toolkit,
MasterView, allows developers to integrate full-text search, imaging or
hypertext into Windows applications.

InnerView directly supports numerous compressed and uncompressed image
[...] structure of a document can constrain searches, as defined by the author
during design. Hyperlinks, set up to navigate from the table of contents,
are also used to display annotations, bookmarks, text and image
cross-references and a search audit trail.
Full-text indexed pages can look like pages, except that images linked
to the pages are stored and displayed in a separate window. Text and
graphics windows can be displayed simultaneously. Audio and motion video
can be "spawned" via external viewers. Text compression is optional.
Encryption and password protection may be provided. Users can mark text for
[...] locating [...] searched by using phrase, Boolean operators, proximity
or wildcard. It may also be conducted against a hit list defined by earlier
search terms. Any or all of the books in the information database can be
searched. The ranking of documents in the hit list can be by number of
hits or by location in the database. All hyperlinks, predefined by the
author, include text-to-text, text-to-image, image-to-image and
image-to-text functions. Text fields can be searched, both within and
across fields, but range searching and reporting based on the fielded data
are not supported.
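
The two ranking orders, and the idea of running a new search against an
earlier hit list, can be sketched in a few lines of Python. This is an
illustrative toy, not InnerView code: modeling the database as a list of
books, each a list of page texts, and the names search and rank are
assumptions made for the example.

```python
def search(books, term, within=None):
    """Return {(book_no, page_no): hit_count} for pages containing `term`.

    `within` is an optional earlier hit list; when given, only those
    locations are searched again.
    """
    term = term.lower()
    hits = {}
    for book_no, pages in enumerate(books):
        for page_no, text in enumerate(pages):
            location = (book_no, page_no)
            if within is not None and location not in within:
                continue
            count = text.lower().split().count(term)
            if count:
                hits[location] = count
    return hits

def rank(hits, by="hits"):
    """Order a hit list by number of hits, or by location in the database."""
    if by == "hits":
        # Most hits first; ties fall back to database order.
        return sorted(hits, key=lambda location: (-hits[location], location))
    return sorted(hits)
```

Passing the result of one search as the within argument of the next narrows
the hit list step by step, and rank then presents it in either order.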

Of the two Windows retrieval shells, InnerView Retrieval uses the
standard Windows interface, while QuickView uses buttons. Non-English
interfaces are supported, and customization of the user interface can be
accomplished with the MasterView toolkit or with programming services from
TMS.

Summary. InnerView has most of the text and image retrieval tools one
could ask for, except for sophisticated concept searching. It also can
display the documents in ways that more closely resemble printed pages than
much of the competition. In a new version of InnerView for Windows that is
being released in the near future, the effort that has gone into designing
the user interface for the retrieval software has now been expended in the
case of the authoring tool, which is still rare in the industry. The
easy-to-use authoring software will be called TMS Publisher (see photo).
TMS Publisher, expected to be shipped early in the second quarter of 1993,
will also add text conversion utilities for converting common word
processing files to SGML, whence it can be converted to the InnerView
format.

Voyager: Expanded Book Toolkit

Voyager's Expanded Book Toolkit and the retrieval software Voyager
calls The Library, combined with HyperCard, present an approach to
electronic publishing on and for the Macintosh that explicitly builds from
the book and page model, expanding it, as the name implies, in useful
directions by incorporating text search, hyperlink and multimedia features.
Output is designed for the screen, rather than for printing.

Voyager was founded in 1985. Its first products were the Criterion
Collection videodiscs that transfer important films to a video format with
great care and impressive technical quality, often accomplishing what
amounts to a restoration in the process. Criterion videodiscs also add
supplementary audio or video material about the film and its creation.
Beyond the Criterion movies, Voyager publishes pioneering multimedia
videodisc and computer software titles on topics such as music, art,
history, current events and travel.

Just as the Criterion Collection raised movies for video players to
the level of a new art form, the Expanded Books series attempts the same
for electronic books. Expanded Books are book titles distributed on digital
media (floppy disks, with the first CD-ROM titles expected soon), designed
with the Macintosh PowerBook as the intended delivery platform.

The emphasis is on using electronic techniques such as full-text
indexing, hyperlinks, audio and graphics to enhance the book format, while
retaining as many features of printed books as possible. Such features
include: full typographic richness of a printed page, ways that readers
mark and annotate conventional books, such as writing or drawing lines in
the margins, highlighting, inserting bookmarks, turning down page corners,
etc. The PowerBook made it possible not only to mimic the printed page, but
for the entire delivery platform to approach the portability of an
ink-on-paper volume.

The Expanded Book Toolkit is the software that Voyager developed to
prepare its own HyperCard-based Expanded Book titles. Voyager is now trying
to leverage its software development by licensing a toolkit to other
publishers.

Toolkit features. The data that appear on the disc are in HyperCard
stacks. Character sets supported by HyperCard are displayable, along with
QuickTime audio and video, CD-Audio, and AIFF and 'snd' audio resources.
Graphics are stored as PICT files.

The authoring process supported by the Voyager Expanded Book Toolkit
involves importing text chapter by chapter from word processing files,
adjusting the appearance of pages and correcting errors, attaching
annotations, and, in the case of CD-ROM, submitting a disk for
premastering.

Text searches use HyperCard's "find" functions, finding next or
previous occurrences of any word, a list of pages on which the word 
appears, and a list of the word in context. Multiple-word searches can be 
conducted, yielding a list of pages on which the words appear together. 
"Find" lists can be saved for later retrieval. 

Hyperlinks both within a book and between books are supported,
creating interesting possibilities for multiple-book CD-ROMs. Hyperlinks
can join text to text, pictures, Macintosh audio, CD-Audio, videodisc or
QuickTime. They also can spawn another application via a HyperTalk "do"
statement.

Authors create the links manually. Users can save their find lists and
can electronically turn down page corners, apply paper clips, write black
lines in the margin, write notes in a notebook and write in the margin of
the page. A large-print feature can change the print size.

The 31 available Voyager Expanded Books titles were published on
floppy disk. The firm is now planning its first CD-ROM titles.

Summary. Voyager's approach to portable electronic books focuses on
the user interface of the retrieval package, which is designed to mimic
electronically with an Apple PowerBook the experience of reading a printed
book. The software still has a unique feel to it, although other products
for Mac and Windows are evolving toward similar approaches. As a
floppy-based product, the user-annotation features are less cumbersome to
implement than with a read-only CD-ROM as the distribution disc.

ZyLab Division of IDI: ZyIndex

ZyLab claims that ZyIndex, created in 1983, was the first PC-based
text retrieval software package. ZyIndex for DOS and Windows are indeed
well known. ZyLab says that 100 clients have used ZyIndex to index CD-ROM
and WORM databases.

ZyImage is a new document imaging package that automatically indexes
scanned pages by first translating the captured images into
machine-readable characters via OCR (optical character recognition) and
then applying ZyIndex full-text indexing software to the resulting text
files.

Either ZyIndex or ZyImage can be used as CD-ROM authoring tools.

The ZyLab products do not compete as multimedia authoring systems, but
rather as document managers. One significant advantage of the ZyLab
packages is their capability to index, read and display documents in their
native file formats. Page breaks and line breaks are preserved, but font
and character attribute codes are ignored. Graphics can be linked or
embedded in the pages for viewing or printing. Not as optimized for CD-ROM
as some software, the ZyLab products' performance on CD-ROM can be improved
by downloading the indexes and retrieval software to the hard disk. The
space needed for the index will be about 35% of the size of the text files,
or potentially as much as 200 megabytes per CD-ROM disc.

The ZyIndex full-text search software supports full-text indexing,
modifications by the author to the stop list, multiple document search,
[...] search, Boolean search and scrollable vocabulary windows. A
thesaurus can be used to broaden a search. A form of concept searching is
also supported.

Text-to-text, graphics-to-text, graphics-to-graphics, bookmarks and
notes links may be defined, preindexed or generated on the fly. ZyImage
automatically links text to graphic files such as scanned images of pages.
Section headers are marked up in the source document and displayed by
ZyIndex. ZyLab promises SGML markup will be supported in late 1993.

Fields may be defined and searched, including by range. A report can
be generated showing KWIC, keyword in context, for all the documents that
are "hits" by the search criteria. Graphics can be zoomed, rotated,
enlarged, scrolled and stretched. Hyperlinks are even supported from
specific locations in a graphics file.
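
For readers unfamiliar with the term, a keyword-in-context (KWIC) report
lists every occurrence of a search word together with a few words on either
side. The Python sketch below shows the general idea only; it is not ZyIndex
output. The function name kwic, the three-word default context and the
tuple layout of each row are assumptions for illustration.

```python
PUNCT = '.,;:!?"()'

def kwic(documents, keyword, width=3):
    """Return (document, left context, keyword as found, right context) rows."""
    key = keyword.lower()
    rows = []
    for name, text in documents.items():
        words = text.split()
        for i, word in enumerate(words):
            # Compare case-insensitively, ignoring surrounding punctuation.
            if word.strip(PUNCT).lower() == key:
                left = " ".join(words[max(0, i - width):i])
                right = " ".join(words[i + 1:i + 1 + width])
                rows.append((name, left, word, right))
    return rows
```

Each row can then be printed with the keyword aligned in a center column,
which is the conventional layout for such reports.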

A ZyIndex or ZyImage database is constructed by telling ZyIndex where
the documents are located. ZyIndex reads the file and builds the index
automatically. The files can reside in multiple locations and be pulled
together for later compilation on the final submission medium for
premastering.

Indexing speed was benchmarked by ZyLab at 20 MB per hour on a '486
PC. ZyIndex supports simultaneous DOS and Windows front ends to the same
set of data and index.

Summary. The typical use of ZyIndex and ZyImage is in-house archival
of text documents and the creation of online manuals. The software is not
well optimized for CD-ROM, but it is inexpensive.